ABSTRACT

We present an algorithm to identify the type of an SN spectrum and to determine its redshift
and age. This algorithm, based on the correlation techniques of Tonry & Davis, is implemented
in the Supernova Identification (SNID) code. It is used by members of ongoing high-redshift SN
searches to distinguish between Type Ia and Type Ib/c SNe, and to identify "peculiar" SNe Ia. We
develop a diagnostic to quantify the quality of a correlation between the input and template spectra,
which enables a formal evaluation of the associated redshift error. Furthermore, by comparing the
correlation redshifts obtained using SNID with those determined from narrow lines in the SN host
galaxy spectrum, we show that accurate redshifts (with a typical error $\sigma_z \lesssim 0.01$) can be
determined for SNe Ia without a spectrum of the host galaxy. Last, the age of an input spectrum is
determined with a typical accuracy $\sigma_t \lesssim 3$ days, shown here by using high-redshift SNe Ia with
well-sampled light curves. The success of the correlation technique confirms the similarity of some
SNe Ia at low and high redshifts. The SNID code, which will be made available to the community,
can also be used for comparative studies of SN spectra, as well as comparisons between data and models.
Subject headings: methods: data analysis — methods: statistical — supernovae: general 



1. INTRODUCTION 

Supernovae (SNe) play a major role in the recent revival of observational
cosmology. It is through comparison of high-redshift Type Ia supernova (SN Ia)
magnitudes with those at low redshift (Hamuy et al. 1996; Riess et al. 1999;
Jha et al. 2006a) that two teams independently found the present rate of the
universal expansion to be accelerating (Riess et al. 1998a; Perlmutter et al.
1999). This astonishing result has been confirmed in subsequent years out to
redshift z < 1 (Tonry et al. 2003; Knop et al. 2003; Barris et al. 2004),
but also at higher redshifts where the universal expansion is in a
decelerating phase (Riess et al. 2004). Currently, two ongoing projects have
the more ambitious goal to measure the equation-of-state parameter, w, of the
"dark energy" that drives the expansion: the ESSENCE (Equation of State:
SupErNovae trace Cosmic Expansion; Miknaitis et al. 2007; Wood-Vasey et al.
2007) and SNLS (SuperNova Legacy Survey; Astier et al. 2006) projects.
Both teams have published their initial results, which indicate that w, if
constant, is consistent with a cosmological constant (w = −1).

The success of these cosmological experiments depends, among other things, on
the assurance that the supernovae in the sample are of the correct type,
namely, SNe Ia. The classification of supernovae is based on their optical
spectra around maximum light (for a review see Filippenko 1997). At high
redshifts, obtaining sufficiently high signal-to-noise ratio (S/N) spectra of
such objects requires 1-2 hr integrations at 6.5-10 m class telescopes
(see, e.g., Matheson et al. 2005), and constitutes
the limiting factor for these experiments. Recently, alternative
classification methods based on photometry alone have been suggested
(Poznanski, Maoz, & Gal-Yam 2007; Kuznetsova & Connolly 2007; Kunz et al.
2006), in anticipation of the next generation of wide-field all-sky surveys
optimized for the detection of transient events (Dark Energy Survey, Frieman
2004; Pan-STARRS, Kaiser et al. 2002; SKYMAPPER, Schmidt et al. 2005; ALPACA,
Crotts & Consortium 2006; LSST, Tyson & Angel 2001). Inclusion of supernovae
that are of a different type leads to biased cosmological parameters
(Homeier 2005). Exclusion of genuine SNe Ia from the sample leads to
increased statistical errors on these same parameters.

The secure classification of supernovae is a challenge at 
all redshifts, however. Even with high S/N spectra, the 
distinction between supernovae of different types (or be- 
tween subtypes within a given type) can pose problems. 
This points to the inadequacy of the present purely em- 
pirical SN classification scheme in establishing distinct 
classes of supernovae, whose observational properties can 
be directly traced back to an explosion mechanism and 
a progenitor system. The two major types of supernovae 
are defined based on the presence (Type II) or absence 
(Type I) of hydrogen in their spectra, a distinction that 
does not reflect the differences in their explosion mechanisms and
progenitors: through the thermonuclear disruption of a carbon-oxygen white
dwarf star (Type Ia), or through the collapse of the degenerate core of a
massive star (Types Ib, Ic, and II). For the latter case, it is now thought
that there exists a continuity of events between the types II→Ib→Ic,
corresponding to increasing mass loss of the outer envelope of the progenitor
star prior to explosion (Chevalier 2006). SNe IIb are an intermediate case
between Type II and Type Ib, and illustrate the tendency of some supernovae
to "evolve" from one type to another. SNe Ic are only defined by the absence
of elements (hydrogen and helium; although see Elmhamdi et al. 2006 for the
presence of hydrogen in SNe Ib/c) in their atmospheres and thus form a
heterogeneous class, which includes the supernovae associated with gamma-ray
bursts (Kulkarni et al. 1998; Matheson et al. 2003). The classification
scheme is further complicated by "peculiar" sub-classes of events associated
with the four types (Ia, Ib, Ic, and II). Nonetheless, this classification
scheme provides a means to keep track of general spectroscopic properties
associated with the many supernovae discovered each year (more than 550 in
2006 according to the International Astronomical Union^1) and is useful for
comparative studies of supernovae with similar characteristics.

The spectrum of a supernova also contains information on its redshift and age
(defined as the number of days from maximum light in a given filter).
Knowledge of the SN redshift is necessary for the use of SNe Ia as distance
indicators (although see Barris & Tonry 2004 for redshift-independent
distances), and is usually determined using narrow lines in the spectrum of
the host galaxy. When such a spectrum is unavailable, one has to rely on
comparison with SN template spectra for determining the redshift, although we
note that Wang (2007) has recently presented a purely photometric redshift
estimator for SNe Ia, albeit with 3-5 times larger errors. The SN age is
usually determined (to within 1 day) using a well-sampled light curve, but a
single spectrum can also provide a good estimate (to within 2-3 days for
SNe Ia; Riess et al. 1997), since the relative strengths and wavelength
location of spectral features evolve significantly on the timescale of days.
Knowledge of the age of the supernova and its apparent magnitude and color at
a single epoch can yield a distance measurement accurate to ~10% (Riess et
al. 1998b). Moreover, comparison of spectral and light-curve ages of
high-redshift supernovae can be used to test the expected time-dilation
factor of (1 + z), where z is the redshift, in an expanding universe
(Riess et al. 1997; Foley et al. 2005).

We have developed a tool (Supernova Identification [SNID]) to determine the
type, redshift, and age of a supernova, using a single spectrum. The
algorithm is based on the correlation techniques of Tonry & Davis (1979),
and relies on the comparison of an input spectrum with a database of
high-S/N template spectra. Fundamental to the success of the correlation
technique is its application to objects that have counterparts in the
template database. We briefly describe the cross-correlation technique in
the next section, before presenting the algorithm for determining the
redshift (§ 3). We then briefly comment on the composition of our spectral
database (§ 4), before testing the accuracy of correlation redshifts and
ages using SNID in § 5. Last, in § 6 we tackle the issue of supernova
classification by focusing on specific examples, some of which are
particularly relevant to SN searches at high redshifts.

2. CROSS-CORRELATION FORMALISM 

The cross-correlation method presented in this section is extensively
discussed by Tonry & Davis (1979), where it is exclusively applied to galaxy
spectra. We reproduce part of this discussion here to highlight the
specificity of determining supernova (as opposed to galaxy) redshifts.
Sections 2.1 and 2.2 are rather technical, while § 2.3 presents the more
practical aspects of spectrum pre-processing necessary for the
cross-correlation method.

2.1. A Few Definitions 

The correlation technique is straightforward: a supernova spectrum $s(n)$
whose redshift $z_s$ is to be found is cross-correlated with a template
spectrum (of known type and age) $t(n)$ at zero redshift. We want to
determine the $(1 + z_s)$ wavelength scaling that maximizes the
cross-correlation $c(n) = s(n) \star t(n)$, where $\star$ denotes the
cross-correlation product. In practice, it is convenient to bin the spectra
on a logarithmic wavelength axis. Multiplying the wavelength axis of $t(n)$
by a factor $(1 + z)$ is equivalent to adding a $\ln(1 + z)$ shift to the
logarithmic wavelength axis of $t(n)$, i.e., a (velocity) redshift
corresponds to a uniform linear shift. Supposing we bin $s(n)$ and $t(n)$
into $N$ bins in the range $[l_0, l_1]$, each wavelength coordinate
$l_{\ln,n}$ is given by

$$l_{\ln,n} = l_0\, e^{\,n \times dl_{\ln}}, \tag{1}$$

where $dl_{\ln} = \ln(l_1/l_0)/N$ is the size of a logarithmic wavelength
bin, and assuming $n$ runs from 0 to $N$. We then have:

$$n = A \ln l_{\ln,n} + B, \tag{2}$$

where $A = N/\ln(l_1/l_0)$ and $B = -N \ln l_0 / \ln(l_1/l_0)$. In what
follows we assume that $s(n)$ and $t(n)$ have been normalized such that their
mean is zero (see § 2.3).

For computational ease and for pre-processing purposes, the
cross-correlation is computed in Fourier space. Let $S(k)$ and $T(k)$ be the
discrete Fourier transforms of the supernova and template spectra,
respectively ($k$ is the wavenumber):

$$S(k) = \sum_{n=0}^{N-1} s(n)\, e^{-2\pi i n k/N}, \qquad T(k) = \sum_{n=0}^{N-1} t(n)\, e^{-2\pi i n k/N}. \tag{3}$$

Let $\sigma_s$ and $\sigma_t$ be the rms of the supernova and template
spectrum, respectively:

$$\sigma_s^2 = \frac{1}{N} \sum_{n=0}^{N-1} s(n)^2, \qquad \sigma_t^2 = \frac{1}{N} \sum_{n=0}^{N-1} t(n)^2. \tag{4}$$

The normalized correlation function $c(n)$ is defined as

$$c(n) = s(n) \star t(n) = \frac{1}{N \sigma_s \sigma_t} \sum_{m=0}^{N-1} s(m)\, t(m - n), \tag{5}$$

such that if the supernova spectrum is the same as the template spectrum,
but shifted by $\delta$ logarithmic wavelength bins, i.e.,
$s(n) = t(n - \delta)$, then $c(\delta) = 1$. The Fourier transform of the
correlation function is

$$C(k) = \frac{1}{N \sigma_s \sigma_t}\, S(k)\, \overline{T}(k), \tag{6}$$

where $\overline{T}(k)$ denotes the complex conjugate of $T(k)$.

^1 http://www.cfa.harvard.edu/iau/lists/Supernovae.html

2.2. Cross-correlation Redshifts

Following Tonry & Davis (1979) we assume that $s(n)$ is some multiple
$\alpha$ of $t(n)$, but shifted by $\delta$ logarithmic wavelength bins:

$$s(n) = \alpha\, t(n - \delta). \tag{7}$$

Unlike Tonry & Davis (1979), however, we do not need to assume that
$t(n - \delta)$ is further convolved with a broadening symmetric function
[$b(n)$ in Tonry & Davis 1979, their eq. 6] that accounts for galaxy stellar
velocity dispersions and spectrograph resolutions. First, while there exists
a velocity dispersion residual in $c(n)$ primarily due to differences in the
dynamics of the expanding envelope for different supernovae, this residual
carries important information on the age of the supernova, which we also
want to determine (see § 5), and on the specificity of $s(n)$, which is
important for more general comparative supernova studies. Second, the
nominal Doppler width of a supernova spectral "feature" is ~1-2 orders of
magnitude greater than the resolution of a typical low-resolution
spectrograph.

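To estimate $\alpha$ and $\delta$, we need to minimize the following
expression:

$$\chi^2(\alpha, \delta) = \sum_{n=0}^{N-1} \left[ \alpha\, t(n - \delta) - s(n) \right]^2 \tag{8}$$

$$\Rightarrow\ \chi^2(\alpha, \delta) = \alpha^2 N \sigma_t^2 - 2 \alpha N \sigma_s \sigma_t\, c(\delta) + N \sigma_s^2, \tag{9}$$

using eqs. 4 and 5. We then obtain the condition for minimizing with respect
to $\alpha$:

$$\frac{\partial \chi^2}{\partial \alpha} = 2N \left[ \alpha \sigma_t^2 - \sigma_s \sigma_t\, c(\delta) \right] = 0, \tag{10}$$

from which we derive $\alpha_{\min}$ satisfying the above:

$$\alpha_{\min} = \frac{\sigma_s}{\sigma_t}\, c(\delta). \tag{11}$$

Substituting this value for $\alpha$ back into eq. 9 we obtain a new
expression for $\chi^2$:

$$\chi^2(\alpha_{\min}, \delta) = N \sigma_s^2 \left[ 1 - c(\delta)^2 \right]. \tag{12}$$

As expected, minimizing $\chi^2$ is equivalent to maximizing the normalized
correlation function $c(\delta)$.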

Thus, the input supernova spectrum $s(n)$ is cross-correlated with a
template spectrum $t(n)$, and a smooth function (here a fourth-order
polynomial, as in Tonry & Davis 1979) is fitted to the highest peak in
$c(n)$, whose height and center determine $\alpha$ and $\delta$,
respectively. The cross-correlation redshift is then trivially computed as

$$z_s = e^{\,\delta \times dl_{\ln}} - 1, \tag{13}$$

where $dl_{\ln}$ is the logarithmic wavelength bin defined in eq. 1. The
width of the peak is a measure of the error in $z_s$ and is of the order of
the typical width of a supernova spectral feature, modulated by the
signal-to-noise ratio of the input spectrum (see § 3.4).
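The correlation and redshift steps of eqs. 5, 6, and 13 can be sketched numerically as follows. This is a minimal illustration and not the SNID implementation: the function names are hypothetical, the input is synthetic noise rather than a real spectrum, and the peak is taken at the nearest integer bin instead of being refined with the fourth-order polynomial fit described above.

```python
import numpy as np

def normalized_cross_correlation(s, t):
    """Circular normalized correlation c(n) (eq. 5), evaluated through FFTs (eq. 6)."""
    s = np.asarray(s, dtype=float)
    t = np.asarray(t, dtype=float)
    s = s - s.mean()                      # spectra are assumed to have zero mean
    t = t - t.mean()
    n_bins = s.size
    sigma_s = np.sqrt(np.mean(s ** 2))    # rms of each spectrum (eq. 4)
    sigma_t = np.sqrt(np.mean(t ** 2))
    C = np.fft.fft(s) * np.conj(np.fft.fft(t)) / (n_bins * sigma_s * sigma_t)
    return np.fft.ifft(C).real

def redshift_from_peak(c, dl_ln):
    """Integer-bin peak position delta and the redshift of eq. 13."""
    delta = int(np.argmax(c))
    if delta > c.size // 2:               # interpret large lags as negative shifts
        delta -= c.size
    return delta, np.exp(delta * dl_ln) - 1.0

# Synthetic check: the "observed" spectrum is the template scaled by 0.5
# and shifted by 37 logarithmic wavelength bins.
rng = np.random.default_rng(0)
dl_ln = np.log(10000.0 / 2500.0) / 1024
template = rng.standard_normal(1024)
observed = 0.5 * np.roll(template, 37)

c = normalized_cross_correlation(observed, template)
delta, z = redshift_from_peak(c, dl_ln)
```

In this toy case the peak height is unity regardless of the scaling factor, which is the property stated after eq. 5; with real data the peak is lower and its position is refined by the polynomial fit.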

It is important to note that we assume the noise per pixel to be constant in
the input spectrum. This is clearly not the case for ground-based optical
spectra, where sharp emission features from the sky background lead to
increased noise at specific wavelengths. Recently, Saunders et al. (2004)
found that scaling the input spectrum by the inverse variance yielded a
dramatic improvement in the derived cross-correlation redshifts;
specifically, Saunders et al. (2004) rewrite eq. 8 as

$$\chi^2(\alpha, \delta) = \sum_{n=0}^{N-1} \left[ \frac{\alpha\, t(n - \delta) - s(n)}{\sigma(n)} \right]^2, \tag{14}$$

where $\sigma(n)$ is the noise per pixel of $s(n)$, and find that this is
equivalent to simply scaling $s(n)$ by $1/\sigma(n)^2$.

This modification is well suited for determining galaxy redshifts, since
sharp features in the variance spectrum (due to sky noise) have widths
comparable to galaxy lines and hence will affect the Fourier transform of
the correlation function, $C(k)$, at similar wavenumbers. However, we have
found no such improvement for determining supernova redshifts. This is
expected since supernova spectra consist of overlapping Doppler-broadened
lines whose widths (~10000 km s$^{-1}$) are 1-2 orders of magnitude greater
than sky emission features. However, noise from an underlying galaxy
continuum can yield power at similar wavenumbers as $C(k)$ and can
significantly degrade
the redshift accuracy when the fraction of galaxy light in 
the supernova spectrum is high (see §0. 

2.3. Pre-processing the Supernova Spectrum 

As already mentioned, the input and template spectra are binned on a common
logarithmic wavelength scale, characterized by $(l_0, l_1, N)$ (eq. 1). We
show the result of mapping an input supernova spectrum onto a logarithmic
wavelength axis with $(l_0, l_1, N)$ = (2500 Å, 10000 Å, 1024) in
Fig. 1a,b. The size of a logarithmic wavelength bin in this case is
$dl_{\ln} = \ln(10000/2500)/1024 \approx 0.0014$, from eq. 1. So a shift by
one bin in $\ln l$ space corresponds to a velocity shift of
$c \times dl_{\ln} \approx 400$ km s$^{-1}$. This is 1 order of magnitude
less than the typical width of a supernova spectral feature,
Doppler-broadened by the ~10000 km s$^{-1}$ expansion velocity of the SN
ejecta.
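The numbers quoted above follow directly from eqs. 1 and 2, as the short sketch below shows. It is illustrative only: the function names are hypothetical, and the use of linear interpolation to resample the flux is an assumption, since the text does not specify how SNID rebins the spectrum.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def log_wavelength_grid(l0=2500.0, l1=10000.0, n_bins=1024):
    """Wavelength coordinates of eq. 1 for n = 0..N, and the bin size dl_ln."""
    dl_ln = np.log(l1 / l0) / n_bins
    n = np.arange(n_bins + 1)
    return l0 * np.exp(n * dl_ln), dl_ln

def bin_index(wavelength, l0=2500.0, l1=10000.0, n_bins=1024):
    """Inverse mapping of eq. 2: n = A ln(l) + B."""
    A = n_bins / np.log(l1 / l0)
    B = -n_bins * np.log(l0) / np.log(l1 / l0)
    return A * np.log(wavelength) + B

def rebin_to_log(wave, flux, l0=2500.0, l1=10000.0, n_bins=1024):
    """Resample a spectrum onto the log grid (linear interpolation is an assumption)."""
    grid, _ = log_wavelength_grid(l0, l1, n_bins)
    return grid, np.interp(grid, wave, flux)

grid, dl_ln = log_wavelength_grid()
velocity_per_bin = C_KMS * dl_ln   # velocity shift corresponding to one bin
mid_bin = bin_index(5000.0)        # bin number of 5000 Angstrom
```

With the default grid, `dl_ln` is about 0.00135 and `velocity_per_bin` is about 406 km/s, consistent with the rounded values in the text.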

The next step in preparing the spectra for correlation analysis is continuum
removal (Tonry & Davis 1979). For galaxy spectra, the continuum is well
defined and is easily removed using a least-squares polynomial fit. In
supernova spectra, however, the apparent continuum is ill-defined due to the
domination of bound-bound transitions in the total opacity (Pinto & Eastman
2000). Dividing out a 13-point cubic spline fit to the spectrum (over the
entire 2500-10000 Å wavelength interval) is akin to removing a pseudo
continuum from the supernova spectrum. We then subtract 1 from the resulting
spectrum and apply a normalization constant for the mean flux to equal zero
(Fig. 1c). This effectively discards any spectral color information
(including reddening uncertainties and flux mis-calibrations), and the
correlation only relies on the relative shape and strength of spectral
features in the input and template spectra. We note that continuum division
is also used by Jeffery et al. (2006) for measuring the goodness-of-fit
between supernova spectra. We see below (§ 5) that the loss of color
information has surprisingly little impact on the redshift and age
determination. Continuum removal also minimizes discontinuities at each end
of the spectrum, which would cause artificial peaks in the correlation
function. Further discontinuities are removed by apodizing the spectra with
a cosine bell (~5% at either end).

Fig. 1. Pre-processing the spectrum for SNID. (a) Spectrum of the SN Ia
SN 2003lj at z = 0.417 (Matheson et al. 2005). (b) The result of mapping the
spectrum to $\ln l$ coordinates, with $(l_0, l_1, N)$ = (2500 Å, 10000 Å,
1024). (c) A 13-point spline has been divided out and the result normalized
to zero mean flux. (d) A bandpass filter with $(k_1, k_2, k_3, k_4)$ =
(1, 4, 25, 102) has been applied to the spectrum.

The final step is the application of a bandpass filter. While it is actually
applied at a later stage, directly to the correlation function, we show its
effect on the input spectrum in Fig. 1d. The goal is to remove low-frequency
residuals left over from the pseudo-continuum removal and high-frequency
noise components. Formally, the Fourier transform of the normalized
correlation function, $C(k)$ (eq. 6), is multiplied by a real bandpass
function $B(k)$ (so that no phase shifts are introduced), such that

$$B(k) = \begin{cases}
0 & \text{for } k < k_1 \text{ or } k > k_4 \\
\frac{1}{2}\left[1 - \cos\left(\pi\, \frac{k - k_1}{k_2 - k_1}\right)\right] & \text{for } k_1 < k < k_2 \\
1 & \text{for } k_2 < k < k_3 \\
\frac{1}{2}\left[1 + \cos\left(\pi\, \frac{k - k_3}{k_4 - k_3}\right)\right] & \text{for } k_3 < k < k_4
\end{cases} \tag{15}$$

The exact choices for the wavenumbers $(k_1, k_2, k_3, k_4)$ depend on the
size of each $k$ bin and on the spectral energy distribution of a supernova
spectrum. Supernova spectral lines have typical widths of ~100-150 Å, due to
the large expansion velocities of the ejecta (~10000 km s$^{-1}$). The mean
size of a logarithmic wavelength bin with $(l_0, l_1, N)$ = (2500 Å,
10000 Å, 1024) is ~7.2 Å, so a typical SN line will have a width
$w_{\rm line}$ ~ 14-21 logarithmic wavelength bins. In Fourier space, most
of the information will be at wavenumbers
$k = N/(2\pi \times w_{\rm line})$ ~ 8-12. Since SN spectra consist of
overlapping spectral lines (Baron et al. 1996), a typical SN feature may
have a lower width (< 50 Å). This translates to $k$ ~ 25, so most
information is at wavenumbers less than 25 and almost everything above
wavenumber $k \gtrsim 50$ is noise. Also, low wavenumbers ($k < 5$) contain
information about the low-frequency residuals from continuum removal. In
Fig. 2 we show the amplitude of the Fourier transform of typical unfiltered
correlation functions as a function of wavenumber. As expected, most of the
correlation power is at wavenumbers $k$ ~ 10-20, and virtually no
information is contained in wavenumbers $k > 50$.
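A bandpass filter of this kind, and the line-width to wavenumber conversion used in the paragraph above, can be sketched as follows. This is only an illustration: the function names are hypothetical, the cosine-taper edges are an assumed filter shape, and the default wavenumbers (1, 4, 25, 102) are those quoted for Fig. 1d.

```python
import numpy as np

def bandpass(k, k1=1, k2=4, k3=25, k4=102):
    """Real bandpass B(k): zero outside [k1, k4], unity on [k2, k3], cosine tapers between."""
    k = np.atleast_1d(np.asarray(k, dtype=float))
    b = np.zeros_like(k)
    rise = (k >= k1) & (k < k2)
    flat = (k >= k2) & (k <= k3)
    fall = (k > k3) & (k <= k4)
    b[rise] = 0.5 * (1.0 - np.cos(np.pi * (k[rise] - k1) / (k2 - k1)))
    b[flat] = 1.0
    b[fall] = 0.5 * (1.0 + np.cos(np.pi * (k[fall] - k3) / (k4 - k3)))
    return b

def filter_correlation(c, **band):
    """Multiply the Fourier transform of c(n) by B(k) and transform back."""
    c = np.asarray(c, dtype=float)
    C = np.fft.rfft(c)
    k = np.arange(C.size)
    return np.fft.irfft(C * bandpass(k, **band), n=c.size)

def line_wavenumber(width_angstrom, n_bins=1024, mean_bin_angstrom=7.2):
    """Wavenumber k = N / (2 pi w_line) for a feature of the given width."""
    w_bins = width_angstrom / mean_bin_angstrom
    return n_bins / (2.0 * np.pi * w_bins)

# Toy signal: a constant offset (k = 0), a component inside the passband
# (k = 10), and a high-frequency component (k = 200).
n = np.arange(1024)
signal = 3.0 + np.cos(2 * np.pi * 10 * n / 1024) + np.cos(2 * np.pi * 200 * n / 1024)
filtered = filter_correlation(signal)
b_values = bandpass([0, 1, 2.5, 4, 10, 25, 102, 200])
```

Because B(k) is real, the k = 10 component passes through with its phase unchanged, while the offset and the k = 200 component are removed.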

3. REDSHIFT ESTIMATE 

In this section we introduce the correlation height-noise ratio $r$ (§ 3.1)
and the spectrum overlap (lap) parameter (§ 3.2), the product of which (the
rlap quality parameter) conveys quantitative information about the
reliability of a cross-correlation redshift output by SNID. We then briefly
describe the redshift estimation (§ 3.3) and associated error (§ 3.4).

Fig. 2. Normalized amplitude of the Fourier transform of typical unfiltered
cross-correlation functions vs. wavenumber, $k$. Note how most of the power
is concentrated at low wavenumbers ($k < 50$), justifying our choice for the
bandpass filter (dashed line). In this example the maximum correlation
amplitude is at $k = 13$, corresponding to a wavelength scale of ~90 Å.

Fig. 3. Correlation height-noise ratio, $r$, is defined as the ratio of the
height, $h$, of the highest peak in the normalized correlation function
(solid line) to the rms of its antisymmetric component, $a(n)$ (dashed
line), about the redshift corresponding to that peak ($z_{\rm peak}$). The
width of the peak, $w$, is used to compute the redshift error (see text for
details). [See the electronic version of the Journal for a color version of
this figure.]

3.1. The r-value 

Tonry & Davis (1979) introduce the correlation height-noise ratio, $r$, to
quantify the significance of a peak in the normalized correlation function,
$c(n)$. It is defined as the ratio of the height, $h$, of the peak to the
rms of the antisymmetric component of $c(n)$, $\sigma_a$, about the
correlation redshift (Fig. 3):

$$r = \frac{h}{\sqrt{2}\, \sigma_a}. \tag{16}$$

In order to compute $\sigma_a$, Tonry & Davis (1979) assume that $c(n)$ is
the sum of an auto-correlation of a template spectrum $t(n)$ with a shifted
template spectrum $t(n - \delta)$ and of a random function $a(n)$ that can
distort the correlation peak:

$$c(n) = t(n) \star t(n - \delta) + a(n). \tag{17}$$

The first term on the right-hand side of eq. 17 is supposed to give a
correlation peak of height $h = 1$ at the exact redshift (corresponding to a
shift $\delta$ in logarithmic wavelength units), while the second part can
distort the peak. Since $t \star t(n - \delta)$ is symmetric about
$n = \delta$, the antisymmetric part of $c(n)$ about $n = \delta$ equals the
antisymmetric part of $a(n)$ about $n = \delta$. We further assume that the
symmetric part of $a(n)$ has roughly the same amplitude as its antisymmetric
part and that the symmetric and antisymmetric parts of $a(n)$ are
uncorrelated. In that case, the rms of $a(n)$ is $\sqrt{2}$ times the rms of
its antisymmetric component.

A perfect correlation will have a peak with $h = 1$ at the exact redshift,
and $c(n)$ will be symmetric about $n = \delta$, thus $\sigma_a = 0$ and so
$r \rightarrow \infty$ (Fig. 4, left panel). Conversely, $r$ will be small
($r < 5$) for a spurious correlation peak (Fig. 4, right panel), and large
($r > 10$) for a significant peak, since $h$ will be close to 1 and
$\sigma_a$ will be small (Fig. 4, middle panel).

Fig. 4. Examples of perfect (left), good (middle), and poor (right)
normalized correlation functions (solid line). The antisymmetric component
of the correlation function about the SNID redshift ($z_{\rm SNID}$) is also
shown (dashed line). [See the electronic version of the Journal for a color
version of this figure.]
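The definition of the r-value can be made concrete with a short sketch. It follows the wording above, taking the antisymmetric component of c(n) about the peak as half the difference between the two sides and treating the correlation function as periodic; the function name and both of these conventions are assumptions made for illustration, and published normalizations may differ.

```python
import numpy as np

def r_value(c, peak):
    """Height-noise ratio r = h / (sqrt(2) * sigma_a) about bin `peak` (eq. 16)."""
    c = np.asarray(c, dtype=float)
    n_bins = c.size
    offsets = np.arange(n_bins)
    right = c[(peak + offsets) % n_bins]   # c(delta + n), periodic indexing
    left = c[(peak - offsets) % n_bins]    # c(delta - n)
    antisym = 0.5 * (right - left)         # antisymmetric component about the peak
    sigma_a = np.sqrt(np.mean(antisym ** 2))
    h = c[peak]
    return np.inf if sigma_a == 0.0 else h / (np.sqrt(2.0) * sigma_a)

# A peak that is perfectly symmetric about bin 10 ...
n_bins, peak = 64, 10
n = np.arange(n_bins)
dist = np.minimum(np.abs(n - peak), n_bins - np.abs(n - peak))
c_perfect = np.exp(-0.5 * (dist / 3.0) ** 2)

# ... and the same peak distorted by an antisymmetric term of amplitude 0.1.
c_distorted = c_perfect + 0.1 * np.sin(2 * np.pi * (n - peak) / n_bins)

r_perfect = r_value(c_perfect, peak)
r_distorted = r_value(c_distorted, peak)
```

The symmetric case gives an infinite r, as in the left panel of Fig. 4, while the sinusoidal distortion has an rms of 0.1/√2 and therefore yields r = 10 for a peak of unit height.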

3.2. Spectrum Overlap 

In SNID, the correlation height-noise ratio $r$ alone does not provide the
estimator by which a correlation peak is deemed reliable. It is further
weighted by the overlap in $\ln l$ space between the input spectrum and each
of the template spectra used in the correlation. In practice, the template
spectra are trimmed to match the wavelength range of the input spectrum at
the redshift corresponding to the correlation peak. For an input spectrum
with rest-frame wavelength range $[l_0, l_1]$, the overlap in $\ln l$ space,
lap, with each template spectrum is in the range

$$0 \le {\rm lap} \le \ln\left(\frac{l_1}{l_0}\right). \tag{18}$$

Thus for an input and template spectra both overlapping the rest-frame
wavelength interval 3500-6000 Å, lap = ln(6000/3500) ≈ 0.54.

The spectrum overlap parameter conveys important absolute information about
the quality of the correlation, complementary to the correlation
height-noise ratio $r$. Supposing a typical SN Ia spectral feature has an
FWHM of $\Delta l \approx 200$ Å at $l \approx 5000$ Å, any correlation with
lap < ln(5400/5000) ≈ 0.08 will be meaningless: any feature will match any
other at practically any redshift. Only when a correlation has an associated
lap that is several times $\ln(\Delta l/l)$ can one rely on the redshift
output by SNID.

Fig. 5. Top: Contours of equal lap = $\ln(l_1/l_0)$ for different rest-frame
wavelength ranges $[l_0, l_1]$ of overlap between input and template
spectra. We usually discard correlations with lap < 0.4. Bottom: Contours of
equal rlap = $r \times$ lap for a broad range of values for the correlation
height-noise ratio ($r$) and spectrum overlap parameter (lap). We usually
discard correlations with rlap < 5 (and lap < 0.4).

In what follows, we usually discard correlation redshifts that have an
associated lap < lap$_{\min}$ = 0.4 and a quality parameter
rlap = $r \times$ lap < rlap$_{\min}$ = 5. In Fig. 5 we show contour plots
for both the lap and rlap parameters, central to the use of SNID.
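The overlap and quality cuts can be illustrated with a few lines of code. This is a sketch with hypothetical function names; it reads the cuts as requiring both lap ≥ 0.4 and rlap ≥ 5 for a correlation to be kept, which is an interpretation of the text rather than a statement of how SNID applies them.

```python
import math

LAP_MIN = 0.4    # minimum spectrum overlap quoted in the text
RLAP_MIN = 5.0   # minimum rlap quality parameter quoted in the text

def spectrum_overlap(input_range, template_range):
    """lap = ln(l1/l0) over the common rest-frame wavelength range (0 if disjoint)."""
    lo = max(input_range[0], template_range[0])
    hi = min(input_range[1], template_range[1])
    return math.log(hi / lo) if hi > lo else 0.0

def is_reliable(r, lap, lap_min=LAP_MIN, rlap_min=RLAP_MIN):
    """Keep a correlation only if both the lap and rlap cuts are satisfied."""
    return lap >= lap_min and r * lap >= rlap_min

# Example from the text: both spectra cover 3500-6000 Angstrom in the rest frame.
lap = spectrum_overlap((3500.0, 6000.0), (2500.0, 10000.0))
```

For this example lap ≈ 0.54, so a correlation with r = 10 (rlap ≈ 5.4) passes, while one with r = 8 (rlap ≈ 4.3) does not.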

3.3. Initial and Revised Redshift Estimates 

For each template spectrum $t_i(n)$, we compute the correlation function
$c_i(n) = s(n) \star t_i(n)$. In general, $c_i(n)$ has many peaks in
redshift space (Figs. 3, 4). The true redshift is most likely the one
corresponding to the highest peak in $c_i(n)$, although in poor
signal-to-noise ratio cases some peaks can distort or surpass the true
redshift peak (Fig. 4, right panel). In practice, SNID selects the 10
highest peaks (labeled with index $j$) in $c_i(n)$ one by one and performs a
fit with a smooth function to determine the peak height and position,
$h_{ij}$ and $\delta_{ij}$, respectively. The corresponding redshift is
$z_{ij} = \exp(\delta_{ij}\, dl_{\ln}) - 1$. The wavelength regions of
$s(n)$ and $t_i(n)$ that do not overlap at $z_{ij}$ are trimmed, and, if the
resulting spectrum overlap lap > lap$_{\min}$, a new "trimmed" correlation
function, $c_{ij}(n)$, is computed, and the corresponding correlation
height-noise ratio ($r_{ij}$), spectrum overlap (lap$_{ij}$), and redshift
($z_{ij}$) are stored.

Once all templates have been cross-correlated with the input spectrum, SNID
computes an initial redshift, $z_{\rm init}$, based on an rlap-weighted
median of all $z_{ij}$. Each redshift $z_{ij}$ is replicated $W_{ij}$ times
according to the following weighting scheme:

If all rlap$_{ij}$ < 4, $z_{\rm init}$ is set to 0.

SNID then computes a revised redshift based on the initial estimate,
$z_{\rm init}$. The input and template spectra (again labeled $i$) are
trimmed such that their wavelength coverage coincides at $z_{\rm init}$. If
the resulting spectrum overlap lap$_i$ > lap$_{\min}$, a second trimmed
correlation function is computed and the correlation height-noise ratio
($r_i$), spectrum overlap (lap$_i$), and redshift ($z_i$) corresponding to
the highest correlation peak are stored. The width $w_i$ of the correlation
peak is also saved and is used to compute the redshift error (see next
section). The revised redshift, $z_{\rm SNID}$, is computed as the
non-rlap-weighted median of all redshifts $z_i$ that satisfy
rlap$_i$ > rlap$_{\min}$ with lap$_i$ > lap$_{\min}$, with the additional
requirement that the individual redshifts $z_i$ do not differ significantly
from the initial redshift estimate: $|z_i - z_{\rm init}| < z_{\rm filt}$,
where $z_{\rm filt} = 0.02$, typically.
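The selection leading to the revised redshift can be sketched as a filtered median. The function name, the array-based interface, and the choice of inclusive (≥) cuts are assumptions for illustration; the redshift, rlap, and lap values below are made up and do not come from any real SNID run.

```python
import numpy as np

def revised_redshift(z, rlap, lap, z_init,
                     rlap_min=5.0, lap_min=0.4, z_filt=0.02):
    """Unweighted median of template redshifts passing the rlap, lap, and z_filt cuts."""
    z = np.asarray(z, dtype=float)
    rlap = np.asarray(rlap, dtype=float)
    lap = np.asarray(lap, dtype=float)
    keep = (rlap >= rlap_min) & (lap >= lap_min) & (np.abs(z - z_init) < z_filt)
    if not keep.any():
        return None                      # no template satisfies the cuts
    return float(np.median(z[keep]))

# Made-up per-template results: the fourth is far from z_init and the fifth
# has too low an rlap, so only the first three enter the median.
z_i = [0.410, 0.417, 0.420, 0.600, 0.415]
rlap_i = [8.0, 12.0, 6.0, 15.0, 3.0]
lap_i = [0.50, 0.60, 0.45, 0.50, 0.50]

z_snid = revised_redshift(z_i, rlap_i, lap_i, z_init=0.416)
z_none = revised_redshift(z_i, rlap_i, lap_i, z_init=0.9)
```

Using a median rather than a mean keeps a handful of discrepant templates from pulling the estimate, and the z_filt window removes correlations that locked onto a different peak.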

3.4. Redshift Error 

One of the advantages of using the cross-correlation technique for redshift
determination is the ability to estimate the redshift error, $\epsilon_z$.
Tonry & Davis (1979) derive a formal expression for $\epsilon_z$ based on
the idea that spurious peaks (positive and negative) in the antisymmetric
component, $a(n)$, of the correlation function, $c(n)$, can distort the true
correlation peak. Obviously, $\epsilon_z$ is proportional to the number of
peaks in $a(n)$, and hence to the mean distance between peaks. Assuming
$c(n)$ and $a(n)$ have similar power spectra, the mean distance between a
peak in $c(n)$ and the nearest peak in $a(n)$ can be estimated as $N/8B$
(Tonry & Davis 1979, their eq. 22), where $N$ is the total number of bins
and $B$ is the highest wavenumber at the half-maximum point of the Fourier
transform of $c(n)$ ($B \approx 25$ here; see Fig. 2). One can then show
that (Tonry & Davis 1979, their eq. 24)

$$\epsilon_z = k_z \times \frac{1}{1 + r}, \tag{20}$$

where $k_z = N/8B$ ($\approx 5$ here) and $r$ is the correlation
height-noise ratio defined in eq. 16. With the additional assumption of
sinusoidal noise in $c(n)$, Kurtz & Mink (1998) find $k_z = 3w/8$, where $w$
is the width of the correlation peak.

In practice, $k_z$ is calibrated using additional redshift measurements,
either through a different technique (e.g., 21 cm measurements for galaxy
redshifts in Tonry & Davis 1979) or using the same cross-correlation
technique on two spectra of the same object (Kurtz & Mink 1998). For
supernova spectra, additional redshift information potentially comes from
narrow emission and absorption lines in the host galaxy, while duplicate
spectra of the same supernova (at the same age) are not common (Table 1). We
find that including the spectrum overlap parameter (lap) yields a more
robust estimator of the redshift error (see also § 5):

$$\epsilon_z = \frac{k_z}{1 + {\rm rlap}}, \tag{21}$$

with $k_z \sim 2$-$4\,w$.



4. THE SNID DATABASE 

4.1. Nomenclature and Age Distribution 

The current SNID spectral database comprises 879 spectra of 65 SNe Ia, 322
spectra of 19 SNe Ib/c, and 353 spectra of 10 SNe II (Table 1). The spectra
are drawn from public archives (SUSPECT^2 and the CfA Supernova Archive^3)
and from the CfA Supernova Program (Matheson et al. 2007). The spectra are
chosen to have high signal-to-noise ratio (typically > 10 per Å) and to span
a sufficiently large optical rest-frame wavelength range
($l_{\min}$ < 4000 Å; $l_{\max}$ > 6500 Å) to include all the identifying
features of SN spectra. We remove telluric features in all the spectra,
either using the well-exposed continua of spectrophotometric standard stars
for the CfA data (Wade & Horne 1988; Matheson et al. 2000), or using a
simple linear interpolation over the strong A- and B-bands. We show the full
suite of spectra for the local SN Ia spectral template SN 1992A (Kirshner et
al. 1993), which also includes some UV data from the Hubble Space Telescope
at some epochs, shifted to zero redshift in Fig. 6.

While we have included all the supernovae available to us for which there
are a large number of epochs of spectroscopy, there are still many more
(> 1000) supernovae for which there are only one to two epochs of
spectroscopy that we have yet to include in the database. We also include
spectra of galaxies, active galactic nuclei, stars (including variable
stars, such as luminous blue variables), and novae. This can be particularly
useful when trying to weed out contaminants from large surveys of
high-redshift supernovae (cf. Matheson et al. 2005).

We show the age distribution of SNID supernova templates for the main
supernova types (Ia, Ib, Ic, II) in Fig. 7. For each type, we show the age
distribution of "normal" representatives of that type, as well as spectra
that show deviations from the latter (in the "other" category). We note that
this division is somewhat qualitative and relies on the identification by
eye of certain characteristic spectroscopic features in the spectra
(Filippenko 1997). We are currently working on a statistical scheme
to separate our template spectra in these various cate- 
gories (see also Fig. [H). The nomenclature for the dif- 
ferent supernova types and their associated subtypes is 
given in Table 2. From Fig. 7 it is clear that the age distribution of the
SNID templates is not uniform, and even bi-modal for SNe Ia. This
potentially introduces age "attractors" that could in principle bias the age
and redshift determination (although see § 5). The fact that there are more
SN Ia templates than SNe Ib, Ic, and II combined also leads to a type
"attractor," with the risk for low-S/N spectra to be preferentially
classified as SN Ia, regardless of their type (see § 6).

^2 http://bruford.nhn.ou.edu/~suspect/index1.html

^3 http://www.cfa.harvard.edu/supernova/SNarchive.html
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~ 3500 A are due to a calibration mismatch between the UV and 
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Fig. 7. — Age distributions of SNID templates for supernovae of 
different types. The number of supernovae corresponding to a given 
type is indicated in square brackets. Note the larger ordinate range 
for SNe la. For SNe II, the age is given in days from the estimated 
date of explosion as opposed to days from maximum in a specific 
band. The transitional SNe lib are included in both Type lb and 
Type II histograms. [See the electronic version of the Journal for 
a color version of this figure.] 
TABLE 2
Supernova types and subtypes

Type        Ia         Ib         Ic          II

"normal"    Ia-norm    Ib-norm    Ic-norm     II-norm (IIP)
            Ia-pec     Ib-pec     Ic-pec      II-pec
"other"     Ia-91T     IIb        Ic-broad    IIL
            Ia-91bg                           IIn
                                              IIb

Note. — "norm" and "pec" refer to "normal" and "peculiar"
subtypes of the corresponding type; see Table 1 for specific
examples. "Ic-broad" is used to identify broad-lined SNe Ic
("hypernovae"), some of which are associated with Gamma-Ray
Bursts. The transitional Type IIb supernovae are included in
both Type Ib and Type II categories.



The execution time of SNID scales linearly with the number
of templates* and is remarkably low compared with χ²-based
methods (see § 1), although see Rybicki & Press (1995) for
fast statistical methods that can compete with the
cross-correlation technique. It is trivial to include large
spectroscopic data sets, such as those from the CfA
Supernova Group (for example, 431 spectra of 32 SNe Ia,
included in the present SNID database, will soon be
published by Matheson et al.
I2OOI . 

4.2. Intrinsic Spectral Variance 

In Fig. 8 we show the standard and maximum deviation from
the mean spectrum of all "Ia-norm" templates at −10, +0,
+10, and +20 days from maximum light. One clearly sees the
rapid variation of SN spectra around maximum light, but
also the change in intrinsic scatter with age. For
instance, the intrinsic spread in the strength and position
of the defining Si II λ6355 feature (which causes the deep
blueshifted absorption around ~6100 Å) decreases between
−10 and +10 days from maximum light. At +20 days, the
scatter is large in that wavelength region due to the
increasing contribution from other ions (mainly Fe II).

The residual variation about the mean spectrum (Fig. 8,
right panel) shows that normal SNe Ia are typically within
10%-20% of the mean spectrum, although deviations greater
than 40% are seen at certain wavelength intervals (again
depending on the age). The fact that all Ia-norm template
spectra are within two standard deviations from a mean
spectrum suggests a possible statistical classification
scheme to differentiate normal SNe Ia from the other Ia
subtypes. With more data, it is in principle possible to do
this more reliably for SNe Ia, as well as other supernova
types.
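As a rough illustration of the two-standard-deviation idea mentioned above, the sketch below builds a mean spectrum and a per-pixel standard deviation from a set of templates at one age, then reports how far a candidate spectrum strays from the mean. It is not part of SNID: the function names, the use of a population standard deviation, and the assumption that all spectra are already pre-processed onto a common wavelength grid are ours.

```python
import statistics


def mean_and_std_spectrum(spectra):
    """Per-pixel mean and standard deviation of equal-length spectra.

    `spectra` is a list of flux lists on a common wavelength grid
    (hypothetical input layout, assumed already normalized).
    """
    columns = list(zip(*spectra))
    mean = [statistics.mean(col) for col in columns]
    std = [statistics.pstdev(col) for col in columns]
    return mean, std


def max_deviation_in_sigma(spectrum, mean, std):
    """Largest |flux - mean| / std over pixels with non-zero scatter."""
    ratios = [abs(f - m) / s for f, m, s in zip(spectrum, mean, std) if s > 0]
    return max(ratios, default=0.0)


def is_within_n_sigma(spectrum, mean, std, n=2.0):
    """True if every pixel lies within n standard deviations of the mean."""
    return max_deviation_in_sigma(spectrum, mean, std) <= n
```

A spectrum that fails `is_within_n_sigma` at some age would be a candidate for one of the non-normal subtypes under this simple criterion; a real scheme would need to treat noise and wavelength coverage as well.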

The intrinsic variation of the Ia-norm templates points to
the inadequacy of describing a given SN subtype with a
single representative template, unless the latter includes
this variance explicitly. Past attempts to create grids of
such template spectra, such as those presented by Nugent et
al. (2002), do not account for the variability within a
given SN type at a given age. We show the corresponding
Nugent template (ver. 1.2) at each age in Fig. 8 (dashed
line). While most of the Nugent template is included within
the standard deviation from the mean spectrum in our
database, there are also significant deviations. We do
note, however, that the comparison is somewhat misleading
since Nugent et al. (2002) had fewer data available to them
for the elaboration of these templates. Nevertheless, we
have tested their use in the SNID spectral database, but
have found them to lead to systematic errors in both the
redshift and age determination.

* Execution time t_exec ~ 6 s (cpu/2.86 GHz)(N_temp/1000), where
N_temp is the number of templates.

5. ACCURACY OF REDSHIFT AND AGE 
DETERMINATION 

We use a simple simulation to test the accuracy of SNID in
determining the redshift and age of a supernova spectrum.
Here we focus on normal SNe Ia since they are the most
represented in our spectral database, although the
conclusions of this section are qualitatively valid for all
other supernova types. Even though normal SNe Ia form a
homogeneous class, the spectra reveal intrinsic variations
at any given age that directly affect the redshift and age
determination. The redshift precision depends primarily on
the typical width of a spectral feature (decreasing from
broad-lined SNe Ic to SNe IIn), which affects the width of
the correlation peak (see Fig. 3). The redshift accuracy
depends primarily on the intrinsic variation of line
positions at a given age. The age determination strongly
correlates with the redshift determination (§ 5.4), and
depends on how quickly the SN spectra evolve at a given
age.

5.1. Presentation of the Simulation 

In this simulation, each Ia-norm spectrum in the SNID
database (cf. Table 1) is correlated with all other Ia-norm
spectra, except for those corresponding to the input
supernova (to ensure unbiased results). We require all
spectra used in the simulation to include the rest-frame
wavelength interval 3700-6500 Å and to have an age (in days
from B-band maximum, hereafter t_B) −10 < t_B < +20.

We show the simulation parameters in Table 3. The input
spectrum is first redshifted to z by simply multiplying the
wavelength axis by (1 + z). We then "contaminate" the input
supernova spectrum with galaxy light (up to 50% of the
total flux), using the elliptical and Sc galaxy templates
of Kinney et al. (1996), and add noise (both random Poisson
noise and sky background) to reproduce the range of typical
signal-to-noise ratio of SN spectra at the simulation
redshifts, when observed with 6.5-10 m-class telescopes
(e.g., VLT, Keck, Gemini, Magellan) used in cosmological
SN Ia surveys. Note that we do not scale the input spectral
flux to match a given simulation redshift, as SNID
normalizes the input and template spectra in a similar
fashion (Fig. 1).
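The preparation of a simulated input spectrum described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the code used for the simulation: the function name and list-based inputs are hypothetical, the galaxy and supernova are assumed to share one rest-frame wavelength grid, and a single Gaussian noise term stands in for the Poisson and sky-background noise.

```python
import random


def simulate_observed_spectrum(wave_rest, flux_sn, flux_gal, z, f_gal, snr, seed=0):
    """Redshift, contaminate, and add noise to a rest-frame SN spectrum.

    wave_rest, flux_sn, flux_gal: equal-length lists on a common
    rest-frame grid (hypothetical inputs, not SNID data structures).
    f_gal: galaxy fraction of the TOTAL flux (0 <= f_gal < 1).
    snr: target signal-to-noise ratio per pixel (assumed constant).
    """
    if not 0.0 <= f_gal < 1.0:
        raise ValueError("f_gal must satisfy 0 <= f_gal < 1")
    rng = random.Random(seed)
    # Redshift: multiply the wavelength axis by (1 + z).
    wave_obs = [w * (1.0 + z) for w in wave_rest]
    # Scale the galaxy so that it carries a fraction f_gal of the total flux.
    sum_sn, sum_gal = sum(flux_sn), sum(flux_gal)
    scale = f_gal / (1.0 - f_gal) * sum_sn / sum_gal if f_gal > 0 else 0.0
    total = [s + scale * g for s, g in zip(flux_sn, flux_gal)]
    # Gaussian noise with sigma = mean flux / snr (simplifying assumption).
    sigma = (sum(total) / len(total)) / snr
    noisy = [f + rng.gauss(0.0, sigma) for f in total]
    return wave_obs, total, noisy
```

No flux rescaling with redshift is applied, in line with the remark above that the input and template spectra are normalized in a similar fashion before correlation.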

We restrict the observed wavelength range over which SNID
computes the correlation to 4000 < λ_obs [Å] < 9000, to
mimic the coverage of the FORS1 optical spectrograph
mounted on the VLT. We have not studied the impact of a
change in this wavelength range on the redshift or age
determination. Furthermore, we force SNID to only consider
correlation redshifts in the interval [0, 1]. For each
correlation, we record the template name, type, subtype,
and age; the correlation redshift and its associated
correlation height-noise ratio (r) and spectrum overlap
parameter (lap); and the width w of the correlation peak
(to estimate the redshift error).

To study the effects of constraints on redshift and age, we
run SNID three times on the input spectrum: once with no
constraints, a second time with a flat constraint on
redshift (±0.01), and a third time with a flat constraint
on age (±3 days). We note that the distribution of redshift
residuals is remarkably Gaussian (§ 5.2), and we are
currently implementing Gaussian priors in SNID. A total of
4 billion correlations were computed with SNID for this
simulation, in just under 70 CPU hr.

Fig. 8. — Left: Standard (light gray) and maximum (dark gray) deviation from the mean spectrum for all Ia-norm templates, at four
different ages. We overplot the corresponding Nugent template at each age (dashed line). All spectra have been pre-processed in the same
way as any SNID template (§ 2.3). Right: Fractional difference from the mean Ia-norm spectrum. We also show the ratio of the mean
spectrum to the corresponding Nugent template (dashed line). [See the electronic version of the Journal for a color version of this figure.]

TABLE 3
Simulation parameters

Parameter                                             Range

Redshift, z                                           0.1 < z < 0.7
Galaxy contamination fraction, f_gal                  0.00 < f_gal < 0.50
Signal-to-noise ratio, S/N (per 2 Å)                  1 < S/N < 15
Age (days from B-band maximum), t_B                   −10 < t_B < +20
Minimum rest-frame wavelength coverage, λ_rest (Å)    3700 < λ_rest < 6500
Observed wavelength range, λ_obs (Å)                  4000 < λ_obs < 9000

5.2. Redshift Residuals and Redshift Error 

We show the distribution of redshift residuals, Δz, versus
the rlap quality parameter in the top-right panel of
Fig. 9, for input parameters 0.3 < z < 0.5, −5 < t_B < +15,
and 2 < S/N (per 2 Å) < 10. The residuals are shown as a
two-dimensional (2D) histogram, with a linear gray-scale
scheme reflecting the number of points in a given
(Δz, rlap) bin. We only show correlations for which the
overlap between input and template spectra lap > 0.4. For
good correlations (rlap > 5), the distribution of redshift
residuals is a Gaussian centered at Δz = 0. In the bottom
right panel, we show the standard deviation of redshift
residuals, σ_z, in rlap bins of size unity. For rlap ≥ 5,
we have a typical error in redshift of order ≲ 0.01.
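The binned standard deviation σ_z(rlap) described above is a simple group-by operation. The sketch below shows one way to compute it; the tuple layout of the correlation records and the choice to skip bins with fewer than two points are our assumptions, not a description of the SNID code.

```python
import math
import statistics
from collections import defaultdict


def sigma_z_by_rlap(results, lap_min=0.4):
    """Standard deviation of redshift residuals in rlap bins of size unity.

    results: iterable of (delta_z, rlap, lap) tuples (hypothetical layout).
    Returns {lower_bin_edge: sigma_z} for bins holding at least two points.
    """
    bins = defaultdict(list)
    for delta_z, rlap, lap in results:
        if lap > lap_min:  # keep only correlations with sufficient overlap
            bins[int(math.floor(rlap))].append(delta_z)
    return {
        edge: statistics.stdev(dz)
        for edge, dz in sorted(bins.items())
        if len(dz) >= 2
    }
```

Dropping the `lap > lap_min` test reproduces the situation in which poor-overlap correlations inflate the scatter at all values of r.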

For poor correlations (rlap < 3) there is a concentration
of points around Δz ≈ −0.01. This is an artifact of the
pseudo-continuum removal, which enhances the contrast
between emission peaks and absorption troughs in the input
and template spectra and biases poor correlations to later
ages. In this simulation, many input spectra at maximum are
attracted to ~+10 days, where the position of SN spectral
features has shifted redward in wavelength due to the
expansion of the supernova envelope (Fig. 10). The template
needs to be shifted less in ln λ space to match the
redshift of the input spectrum, which leads to an
underestimation of the redshift by ~0.01. This corresponds
to a combination of the typical velocity shift in SN Ia
absorption features from maximum to ~10 days past maximum,
and the spread of these velocities at a given age
(~3000 km s^-1; see Benetti et al. 2005; Blondin et al.
2006a). We note that this artifact has no impact on
correlations with rlap > 3.

Fig. 9. — Top: 2D histograms of redshift residuals vs. the
correlation height-noise ratio r (left panel) and the rlap quality
parameter (with lap > 0.4; right panel), with the following
parameters: 0.3 < z < 0.5, −5 < t_B < +15, 2 < S/N (per 2 Å) < 10.
The linear gray scale reflects the number of points in a given 2D
bin (the more points the darker). Bottom: Standard deviation,
σ_z, of redshift residuals in r, rlap bins of size unity. For the σ_z(r)
curve (filled circles, bottom left), we show the effect of additionally
requiring that lap > 0.4 (open circles).

[Figure 10 axes: λ (Å); c(λ − λ0)/λ0 (10^3 km/s)]

Fig. 10. — Evolution of the blueshifted Si II λ6355 absorption
profile in the SN Ia SN 1994D (Hoflich 1995; Patat et al. 1996)
between −11 and +7 days from B-band maximum. The dotted line
shows the velocity location of the locus of maximum absorption,
v_abs. We highlight the Si II profile at maximum light (dashed line).
Over the course of 18 days, the locus of maximum absorption shifts
redward in wavelength by ~80 Å, corresponding to ~4000 km s^-1
in velocity. Note the more rapid evolution of v_abs before maximum
light. [See the electronic version of the Journal for a color version
of this figure.]
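The size of the redshift bias quoted above for poor correlations follows from a one-line conversion between a line-velocity mismatch and a shift in ln λ. The helper below is a first-order sketch; the function name and the optional (1 + z) factor are our own additions for illustration.

```python
C_KM_S = 299792.458  # speed of light in km/s


def redshift_bias_from_velocity(delta_v_km_s, z=0.0):
    """First-order redshift error from a line-velocity mismatch.

    A feature displaced by delta_v shifts ln(lambda) by about delta_v / c,
    so ln(1 + z) is misestimated by the same amount and the redshift by
    roughly (1 + z) * delta_v / c.
    """
    return (1.0 + z) * delta_v_km_s / C_KM_S


# A ~3000 km/s mismatch corresponds to a redshift error of about 0.01.
bias = redshift_bias_from_velocity(3000.0)
```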

In the left panels of Fig. 9 we show the same distribution
of Δz, this time only as a function of the correlation
height-noise ratio, r (note the change in the abscissa
range). To first order, the 2D histogram of redshift
residuals looks remarkably similar to that as a function of
rlap (Fig. 9, right panels), with again a concentration of
points around Δz ≈ −0.01 for low values of r. However, the
variation of σ_z with r (bottom left panel, filled circles)
gives a different picture: the lack of constraint on lap
causes in some cases a misestimate of the redshift, at all
r, thereby greatly biasing σ_z to higher values
(σ_z > 0.03, for all r). Requiring that lap > 0.4 leads to
a significant improvement (open circles), with σ_z < 0.01
for r > 10. It is therefore imperative to consider the
overlap between the input and template spectra to yield
accurate supernova redshifts with the cross-correlation
technique.

Fig. 11. — Same as Fig. 9 except the abscissae now correspond to
(1 + r, rlap)/w, where w is the width of the correlation peak. A fit
to the binned σ_z distributions (bottom) yields the value for k_z used
in estimating the error (eq. 21). In both panels, only correlations
with lap > 0.4 are shown.

The formal redshift error, ε_z, is proportional to
w/(1 + rlap) (eq. 21), w being the width of the correlation
peak (Fig. 3). We illustrate the determination of the
constant of proportionality, k_z, in Fig. 11, where we show
the same 2D histograms of redshift residuals Δz, this time
as a function of (1 + r)/w (left panels) and (1 + rlap)/w
(right panels). A best fit to the curves in the bottom
panels yields a value for k_z: 5.3 for (1 + r)/w, and 3.1
for (1 + rlap)/w. Only correlations with lap > 0.4 are
shown. As in Fig. 9, the product of the r-value and the
overlap yields a more robust error estimator than the
r-value alone. In what follows we study variations of
redshift and age determinations using SNID only as a
function of the rlap quality parameter, with lap > 0.4.
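To make the error estimate concrete, the sketch below evaluates ε_z = k_z w/(1 + rlap) and shows one possible way to fit k_z to a binned σ_z curve. The least-squares form of the fit, the function names, and the assumption that w is expressed in redshift units are ours; the text above only states that a best fit to the binned curves was performed.

```python
def formal_redshift_error(width, rlap, k_z=3.1):
    """Formal redshift error: k_z * w / (1 + rlap).

    width: width w of the correlation peak (assumed in redshift units).
    k_z:   constant of proportionality; 3.1 is the value quoted above
           for the (1 + rlap)/w abscissa.
    """
    return k_z * width / (1.0 + rlap)


def fit_k_z(x_values, sigma_values):
    """Least-squares k for the model sigma_z = k / x, x = (1 + rlap) / w.

    Minimizes sum((sigma_i - k / x_i)**2) over the binned curve
    (one plausible fitting choice, stated here as an assumption).
    """
    num = sum(s / x for x, s in zip(x_values, sigma_values))
    den = sum(1.0 / x ** 2 for x in x_values)
    return num / den
```

For instance, a peak of width 0.02 with rlap = 9 and k_z = 3 gives an error of 0.006.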

In principle, k_z needs to be evaluated for every template
spectrum in the database, through either internal or
external comparisons (as done for galaxy spectral templates
in Tonry & Davis 1979; Kurtz & Mink 1998). While this is
impractical for supernova spectra (there are few duplicate
spectra of the same supernova at a given age; Table 1), we
have computed k_z using subsets of templates used in our
simulation, as well as for other supernova types, and have
found that k_z is typically in the range 2 < k_z < 4, with
k_z ≈ 3 being the median value.

The above holds for a single spectral template; to use SNID
to its full capacity, we need to combine redshifts for all
templates for which the rlap quality parameter is greater
than a certain cutoff (generally, rlap > rlap_min = 5). In
§ 3.3 we favored the non-rlap-weighted median of all
correlation redshifts with rlap > rlap_min as being "the"
SNID redshift, but did not justify this. In Fig. 12 (top
panels), we show distributions of SNID redshift residuals,
when the SNID redshift is taken to be the redshift
corresponding to the highest rlap > 5 value ("best"; left
panel), the median (middle panel), or the rlap-weighted
mean (right panel) of all redshifts with rlap > 5. Both the
median and mean distributions are consistent with a
Gaussian distribution, with the median redshift providing a
slightly better match. This is expected since the use of
the median redshift guards us from systematic errors
produced by spurious or ill-defined correlation peaks for
some templates. The distribution of "best" redshift
residuals is broader and non-uniform. We therefore consider
the median redshift to provide the best estimate.

Fig. 12. — Top: Normalized distributions of redshift residuals,
when the SNID redshift is assumed to be the redshift of the
best-match template (left), the median redshift (middle), and the
rlap-weighted mean redshift (right). The distribution of median
redshifts is the most consistent with a normal distribution.
Bottom: Normalized distributions of the ratio of the absolute redshift
residual (corresponding to the different redshift estimators in the
top panel) to the redshift error, ε_z, estimated in different ways (see
text for details).

In the bottom panels of Fig. 12 we show the normalized
distributions of the ratio of the absolute redshift
residual (corresponding to the different redshift
estimators in the top panel) to the redshift error, ε_z,
estimated in different ways. A ratio equal to or above
unity indicates that the actual redshift is consistent with
the SNID redshift within the estimated error, while a ratio
below unity indicates that the error is underestimated. For
a good error estimator, we expect those distributions to
peak at a ratio near unity, with a long tail to higher
ratios and a sharp drop below unity. Such is the case for
the formal redshift error (eq. 21) associated with the
"best" redshift (left panel). It is not obvious which error
to associate with the median redshift. We found that the
standard deviation of all correlation redshifts with
rlap > 5 provided a satisfactory estimate of the error
(middle panel). This same estimator was used by Matheson et
al. (2005) for high-z SN Ia spectra from the ESSENCE
survey. The error in the rlap-weighted mean (right panel),
on the other hand, systematically underestimates the true
redshift error by a factor of ~3.
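The three redshift estimators compared above, together with the standard-deviation error attached to the median, can be written compactly. This is a sketch under our own assumptions about the data layout (one (z, rlap) pair per template) and is not taken from the SNID source.

```python
import statistics


def snid_redshift_estimates(correlations, rlap_min=5.0):
    """Combine per-template correlation redshifts above an rlap cutoff.

    correlations: list of (z, rlap) pairs, one per template
                  (hypothetical layout).
    Returns None if no correlation passes the cutoff.
    """
    good = [(z, rlap) for z, rlap in correlations if rlap > rlap_min]
    if not good:
        return None
    zs = [z for z, _ in good]
    # "best": redshift of the template with the highest rlap.
    best_z = max(good, key=lambda pair: pair[1])[0]
    # Unweighted median: the estimator favored in the text.
    median_z = statistics.median(zs)
    # rlap-weighted mean, shown for comparison.
    weighted_mean_z = sum(z * r for z, r in good) / sum(r for _, r in good)
    # Error on the median: standard deviation of the retained redshifts.
    median_err = statistics.stdev(zs) if len(zs) > 1 else float("nan")
    return {
        "best": best_z,
        "median": median_z,
        "weighted_mean": weighted_mean_z,
        "median_error": median_err,
    }
```

Because the median ignores how far an outlying redshift lies from the bulk, a single spurious correlation peak moves it far less than it moves the weighted mean.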

5.3. Age Residuals 

Unlike redshift, the supernova age is not (and cannot be) a
free parameter in SNID, as it is a discrete variable tied
in with a specific spectral template. Nevertheless, since
the cross-correlation technique relies solely on the
relative strengths and position of broad spectroscopic
features, which themselves are a strong function of the
supernova age (Figs. 6, 8, & 10), we expect a strong
correlation between the rlap quality parameter and the age
residual, Δt, between input and template spectra.

Fig. 13. — Top: 2D histogram of age residuals vs. the rlap
quality parameter (with lap > 0.4), with the same parameters as
in Fig. 9. The gray scale reflects the number of points in a given
2D bin (the more points the darker). Bottom: Standard deviation,
σ_t, of age residuals in rlap bins of size unity. For rlap > 6, σ_t < 5
days.

We show the distribution of age residuals versus rlap in
Fig. 13 (top panel), where the gray scale has the same
meaning as in the previous 2D histograms. For rlap > 6, the
distribution of age residuals is a Gaussian centered at
Δt = 0. In the bottom panel, we show the standard deviation
of age residuals, σ_t, in rlap bins of size unity. For
rlap > 6, we have a typical error in age of order
σ_t < 5 days.

The most striking feature in the gray scale of Fig. 13 is
the near absence of points around Δt = 0 for low values of
rlap. For poor correlations, the age is systematically
misestimated, with a tendency to overestimate the age by
~10 days. Again, this is an artifact of the
pseudo-continuum removal, which causes many maximum-light
spectra to correlate with ~+10-day templates (see § 5.2).
It is also due to the nature of the supernova evolution, as
the spectra evolve more rapidly around maximum light than
they do around 10 days past maximum (Fig. 8), increasing
the likelihood of correlations with templates at these
ages.

There is no formal estimator for the age error. We have 
examined the distribution of age residuals above a certain 
rlap cutoff (cf. Fig. [12] for redshift) and find the median 
age of all templates with rlap > 5 to be a good estimate 
of the spectral age. However, the standard deviation of 
all template ages with rlap > 5 tends to systematically 
overestimate the age error by ~ 20%. 

5.4. Covariance Between Redshift and Age 

The determination of redshift and age is intrinsically
connected, and in principle one should marginalize over one
parameter to infer the other. Marginalizing out the
redshift (a continuous variable) to infer the age is
straightforward, but the reverse is more complex, as it
involves marginalization over sparsely sampled variables.
The techniques to do this abound in the Bayesian
literature, but we have yet to implement them in SNID.
Nevertheless, we illustrate the covariance between redshift
and age using the 2D histogram of age versus redshift
residuals, for correlations satisfying rlap > 5 (Fig. 14).
As expected (see § 5.2), over(under)-estimating the age
leads to under(over)-estimating the redshift, since the
loci of maximum absorption shift to the red with age
(Fig. 10).

Fig. 14. — Age residuals vs. redshift residuals, illustrating the
covariance between the two quantities. We show the 1σ (solid
line) and 2σ (dashed line) contours. The parameters are the same
as those used in Fig. 9 with the requirement that rlap > 5.

The anti-correlation between redshift and age residuals
shown in Fig. 14 suggests that constraints on one parameter
should improve the accuracy of the other. Fig. 15 shows the
effect on the distributions of redshift (left panel) and
age (right panel) residuals (for rlap > 5; open histograms)
of adding a flat ±3-day constraint on age and a flat ±0.01
constraint on redshift, respectively (hatched histograms).
A constraint on the age leads to a ~30% narrower
distribution of redshift residuals (from σ_z = 0.006 to
σ_z = 0.004) and a constraint on redshift improves the age
determination by ~15% (σ_t = 3.4 days to σ_t = 2.9 days).
In practice, the constraint on redshift generally comes
from a spectrum of the SN host galaxy, and one can impose a
constraint on age using a well-sampled light curve of the
supernova (one for which the date of maximum light is
easily determined).
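A flat constraint of the kind described above can be pictured as a simple cut on the list of template correlations before the median is taken. The sketch below is only one way to express this: the dictionary layout and function name are hypothetical, and how SNID applies such constraints internally is not specified in this section.

```python
def apply_flat_constraints(correlations, z_prior=None, dz=0.01,
                           age_prior=None, dt=3.0):
    """Keep correlations compatible with flat redshift and/or age priors.

    correlations: list of dicts with keys "z", "age", "rlap"
                  (hypothetical layout).
    z_prior:   external redshift, e.g. from the host galaxy; None = no cut.
    age_prior: external age in days, e.g. from a light curve; None = no cut.
    """
    kept = []
    for c in correlations:
        if z_prior is not None and abs(c["z"] - z_prior) > dz:
            continue
        if age_prior is not None and abs(c["age"] - age_prior) > dt:
            continue
        kept.append(c)
    return kept
```

Taking the median redshift of the correlations that survive an age cut, or the median age of those that survive a redshift cut, then mirrors the constrained runs of the simulation.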

The age distribution of SN spectral templates in the
database affects the accuracy of both cross-correlation
ages and redshifts. In Fig. 16 we show the result of a
Monte Carlo simulation where we compute the number of SN Ia
spectra in bins of 3 days that would be sufficient for
accurate redshift (left panel) and age (right panel)
determinations with SNID. The solid histogram is the actual
age distribution of normal SN Ia templates in the interval
−10 < t_B < +20 and the dotted histogram is the Monte Carlo
distribution. We compute 1000 Monte Carlo realizations for
each unity increment in the number of spectra in a given
3-day age bin. The number of spectra was chosen such that
adding more spectra would not change the mean redshift and
age residuals (for rlap > 5) by more than 0.0001 and 0.1
days, respectively. For this Monte Carlo distribution, at
least eight correlations with rlap > 5 are needed for the
median redshift and associated error (Fig. 12) to provide
an accurate estimate.

The Monte Carlo age distribution for redshift determination
(Fig. 16, left panel) has an initial peak around −10 days
and a bell-shaped envelope roughly centered around maximum
(0 days), akin in fact to a supernova light curve. This is
due to the faster evolution of supernova spectra around
maximum light than around 1-2 weeks past maximum (Figs. 8
& 10). In other words, SNID can accurately determine the
redshift of an input spectrum at +10 days using a template
at +15 days, since the wavelength (velocity) positions and
relative strengths of spectral features change little over
this age interval, but will be less accurate when an input
spectrum at maximum light is correlated with template
spectra at +5 days, since the evolution of the spectra is
more significant then. The initial peak around −10 days is
due to the rapid decrease in spectral line blueshifts from
~ −10 days to ~ −5 days (Fig. 10; Benetti et al. 2005;
Blondin et al. 2006a), rather than to a change in the
relative strengths of spectral features. We are currently
lacking normal SN Ia spectral templates around +5 days past
maximum light. This gap will be filled shortly with a new
set of spectra from the CfA Supernova Program (almost 50
SNe Ia with more than 10 epochs of spectroscopy since
2000).

Fig. 15. — Effect of age and redshift constraints on redshift (left)
and age (right panel) residuals, respectively, with the same
parameters as in Fig. 9. Here z_SNID (t_SNID) corresponds to the median
of all redshifts (ages) with rlap > 5. The open and hatched
histograms correspond to residuals with no constraint and a constraint
on (z, t), respectively. [See the electronic version of the Journal for
a color version of this figure.]

Fig. 16. — Actual (solid line) and Monte Carlo (dotted line) age
distributions of normal SN Ia templates for redshift (left panel) and
age (right panel) determination. The Monte Carlo distribution
was computed such that adding more spectra would not change
the mean redshift and age residuals (for rlap > 5) by more than
0.0001 and 0.1 days, respectively. [See the electronic version of the
Journal for a color version of this figure.]
The Monte Carlo age distribution for age determination
(Fig. 16, right panel) is altogether different, but the
same reasons apply: due to the rapid evolution of SN
spectra around maximum light, it is easier to accurately
determine the age then than at 1-2 weeks past maximum,
where the spectra evolve on longer timescales. Hence, more
spectra are needed at later ages than around maximum light.
The current number of normal SN Ia templates in our
database is sufficient for accurate age determinations out
to t_B < +15 days, but we need twice the number of
templates in the last age bin. Again, this is within reach
with the new set of CfA spectra.

To minimize the impact of our currently non-optimal age
distribution of SN Ia templates, we have studied the
redshift and age residual distribution when imposing a
1/N_temp(t)^α weighting scheme, where N_temp(t) is the
template age distribution (in 3-day bins) and
0.0 < α < 2.0 (α = 0.5 corresponds to a Poisson-like
weighting scheme). This way, the artificial attractors in
the actual age distribution around maximum light and +10
days are down-weighted with respect to templates at other
ages. The weighting scheme does not lead to any improvement
in either the redshift or age determinations, namely the
distribution of residuals for rlap > 5 does not get any
narrower. Clearly, a more elaborate method is necessary to
break the redshift-age degeneracy, and the covariance
between the two quantities will be included explicitly in a
future version of SNID.
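The 1/N_temp(t)^α weighting described above amounts to counting templates per 3-day age bin and assigning each template the inverse of its bin count raised to the power α. A minimal sketch follows; the function name and the binning by floor division are our assumptions.

```python
import math
from collections import Counter


def age_bin_weights(template_ages, alpha=0.5, bin_width=3.0):
    """Weight each template by 1 / N_temp(t)**alpha.

    template_ages: template ages in days from B-band maximum.
    N_temp(t) is the number of templates sharing the same
    bin_width-day age bin; alpha = 0.5 gives a Poisson-like weighting
    and alpha = 0 recovers equal weights.
    """
    bins = [math.floor(age / bin_width) for age in template_ages]
    counts = Counter(bins)
    return [1.0 / counts[b] ** alpha for b in bins]
```

Templates in crowded age bins (for example near maximum light) receive smaller weights, which is how the scheme is meant to soften the age attractors.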

Fig. 16 only shows the age distribution of normal SN Ia
templates, for which we have sufficient spectra in the
database to construct a viable Monte Carlo simulation. The
faster evolution of supernova spectra around maximum light
is common to all supernova types, but the homogeneity at a
given age may vary significantly. We are not in a position
to test this thoroughly, due to the limited number of Type
Ib/c and Type II templates in the current SNID database.
Again, we are confident that new data from the CfA
Supernova Program will better constrain the variance of SN
spectra at a given age (almost 20 SNe Ib/Ic/II with more
than 10 epochs of spectroscopy since 2000). Therefore,
while the overall shape of the Monte Carlo distributions
should remain the same, the absolute scale should be
different for the various supernova types.

5.5. Variation of Redshift and Age Accuracy with 
Redshift, Age, S/N and Galaxy Contamination 

The previous studies are valid for the following parameter
space: 0.3 < z < 0.5; −5 < t_B < +15; 2 < S/N (per 2 Å)
< 10. However, we expect the accuracy of cross-correlation
redshifts and ages to change with redshift, age, and
signal-to-noise ratio of the input spectrum.

In the top panels of the left group of panels of Figure 17,
we show the variation of the standard deviation of redshift
residuals, σ_z, with the rlap quality parameter for varying
redshift (left), age (middle), and S/N (right). We expect
the degrading accuracy with redshift, since the rest-frame
overlap (lap) between the input and template spectra in our
database decreases with redshift. Even requiring that
lap > 0.4 can lead to degenerate redshifts at the higher
end of the redshift range (z > 0.5). Increasing the number
of spectra extending blueward to λ > 2000 Å would partially
alleviate this problem, although the flux is strongly
depleted at these wavelengths (due to line blanketing from
iron-group elements) and the most prominent features in
supernova spectra are at optical wavelengths. UV spectra of
(nearby) supernovae are rare, but the database could be
expanded at these wavelengths (for SNe Ia, at least) by
including the higher-S/N publicly available spectra of
ongoing high-z SN searches, such as the spectra from the
ESSENCE project (Matheson et al. 2005).

The variation of σ_z(rlap) with S/N of the input spectrum
(Fig. 17, left group of panels; top right panel) is also
expected, with a significant degradation below S/N < 3 per
2 Å. The degradation with increasing age of the input
spectrum is again due to the slower evolution of SN spectra
at later ages. The input spectrum will correlate well with
template spectra over a larger range of ages, where the
scatter in the velocity location of spectral features will
translate directly into an error in redshift.

In the bottom panels of the left group of panels of
Figure 17 we show the effect of applying a flat ±3 day age
constraint on the σ_z(rlap) curves. The improvement is
significant in all cases (although less so for z = 0.7).

In the top panels of the right group of panels of
Figure 17 we show the variation of the standard deviation
of age residuals, σ_t, with the rlap quality parameter for
varying redshift (left), age (middle), and S/N (right).
Again, the degradation at the highest redshift (z = 0.7) is
expected, although it is surprising that the σ_t(rlap)
curves for z = 0.3 and z = 0.5 lie atop the one
corresponding to z = 0.1. It appears that the optimal
rest-frame wavelength range of the input spectrum is
different for redshift and age determination. This has
already been mentioned by Foley et al. (2005) concerning
the age determination and points toward the need for an
age- and wavelength-dependent lap(t, λ) parameter to weight
the correlation height-noise ratio, r, instead of the
constant lap currently implemented in SNID. The difficulty
of determining the age of an input spectrum at later times
is due to the less rapid evolution of the spectra at these
ages. At high values of rlap (> 7), however, σ_t decreases
with age. This behavior is unexpected, given the discussion
in § 5.4, and could again be due to the
wavelength-independent nature of our lap parameter.

Even more surprising is the apparent independence of
σ_t(rlap) on S/N: for fixed redshift and age (here z = 0.5
and −5 < t_B < +5), the rlap quality parameter gives an
absolute measure of the age accuracy, regardless of the S/N
of the input spectrum. Of course, the probability of having
correlations with high rlap values drops with S/N, but we
have checked that our simulation yielded a sufficient
number of correlations at rlap > 7 for this result to be
statistically significant.

We next study the impact of contamination from the
underlying spectrum of the host galaxy affecting the input
supernova spectrum. The contamination fraction will depend
both on the projected position of the supernova within its
host (higher contamination closer to the nucleus) and on
the relative flux difference between the supernova and the
portion of the galaxy located in the same aperture (i.e.,
immediately underlying the SN trace, when extracting the
spectrum). Several techniques are commonly used to separate
the supernova light from that of the host galaxy, either
through galaxy template subtraction (e.g., in the algorithm
presented by Howell et al.) or using more elaborate
techniques such as two-channel deconvolution directly
applied to the 2D spectrum (Blondin et al. 2005). However,
neither of these techniques works well in cases where the
SN lies on top of the nucleus of a bright galaxy, and in
all other cases there still remains some fraction of galaxy
light in the SN spectrum.

Fig. 17. — Left: Variation of σ_z with redshift (top left), age (in days; top middle) and S/N (per 2 Å; top right). We show the effect of
applying a ±3 day constraint on age in the bottom panels. Right: Same as the left group of panels, but for the variation of σ_t with redshift
(top left), age (top middle) and S/N (top right). We show the effect of applying a ±0.01 constraint on redshift in the bottom panels. [See
the electronic version of the Journal for a color version of this figure.]

We contaminate each input spectrum in our simulation
with galaxy light, using the elliptical and Sc galaxy
templates of Kinney et al. (1996). The top panels of
Figure 18 show the effect on redshift residuals, $\sigma_z$(rlap), of
increasing the galaxy contamination fraction (expressed
in fractions of the total flux) from 0.00 to 0.50. The
impact of the elliptical galaxy (top left panel) is most
severe, since the spectra of early-type hosts contain broad
continuum structures that yield strong power at wavenumbers
similar to those of supernova features in Fourier space.
Late-type galaxies have smoother continua and their narrow
emission lines are filtered out using the bandpass
filter. In the bottom panels of Fig. 18 we apply a flat
age constraint of ±3 days. The improvement, if any, is
negligible for both the elliptical and Sc galaxy types.
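
As an illustration of the contamination step, the following Python sketch mixes a galaxy template into a supernova spectrum at a chosen fraction of the total flux. It is not the SNID implementation: the function name `contaminate`, the assumption that both spectra are sampled on one common wavelength grid, and the definition of the fraction in terms of flux summed over that grid are ours.

```python
import numpy as np


def contaminate(sn_flux, gal_flux, fraction):
    """Add galaxy light to a supernova spectrum (hypothetical helper).

    `fraction` is the galaxy share of the total flux, summed over the
    common wavelength grid. This definition is an assumption; the text
    only states that the fraction is expressed relative to the total flux.
    """
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must lie in [0, 1)")
    sn_flux = np.asarray(sn_flux, dtype=float)
    gal_flux = np.asarray(gal_flux, dtype=float)
    # Scale the galaxy so that sum(gal) / (sum(sn) + sum(gal)) == fraction.
    scale = fraction / (1.0 - fraction) * sn_flux.sum() / gal_flux.sum()
    return sn_flux + scale * gal_flux


# Toy spectra on a shared 100-pixel grid.
sn = np.full(100, 2.0)
gal = np.linspace(1.0, 3.0, 100)
mixed = contaminate(sn, gal, 0.25)
galaxy_share = (mixed.sum() - sn.sum()) / mixed.sum()
```

In a simulation of the kind described above, one would loop over fractions from 0.00 to 0.50 and feed each contaminated spectrum to the correlation code.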

We have run several simulations to test whether the
rlap quality parameter could be used to evaluate the
amount of galaxy contamination for various galaxy types,
but the results were inconclusive. This constitutes the
real limit of SNID: some extra pre-processing of the
supernova spectrum is necessary to ensure that the input
to SNID is as "clean" as possible. Other algorithms (see
§ 7) perform a simultaneous fit of the galaxy fraction
when comparing the input SN spectrum to the set of
templates in the database, which enables the classification
of supernovae when the galaxy contamination is < 75%
(Howell et al. 2005).

5.6. Comparison with External Measurements 

In this section we test the accuracy of correlation
redshifts using SNID by comparing them with those of the
host galaxy. Galaxy redshifts ($z_{\rm gal}$) are routinely
determined using nebular emission lines in their spectra
or by cross-correlation with absorption-line galaxy spectral
templates (Kurtz & Mink 1998). They are typically
accurate to < 0.001. However, the redshift of the supernova
can differ from $z_{\rm gal}$, since the supernova event may
have occurred in a region where its velocity (in the galaxy
rest frame) is different from the mean value, due to the
velocity dispersion of the galaxy's light-emitting component
(~ 100 km s$^{-1}$ and ~ 200 km s$^{-1}$ in early- and
late-type galaxies, respectively; McElroy 1995). Nevertheless,
$z_{\rm gal}$ gives a more accurate determination of the
SN redshift than SNID (which has typical redshift errors
of < 0.01 for rlap > 5), so a comparison of the two gives
a valuable indication of the accuracy of SNID redshifts
determined from real data.

Fig. 18. Top: Impact of galaxy contamination fraction on
$\sigma_z$(rlap), for both elliptical (E; left panel) and spiral (Sc; right
panel) galaxies. The different lines correspond to different galaxy
contamination fractions, expressed in fractions of the total flux.
Bottom: Effect of applying a ±3 day constraint on the age. [See
the electronic version of the Journal for a color version of this
figure.]

We have selected high-redshift SN Ia spectra taken by
the ESSENCE team (Matheson et al. 2005; Foley et al.,
in prep; Blondin et al., in prep), for which a redshift of
the host galaxy was obtained. This amounts to 57 SN Ia
spectra in the redshift range 0.164 < z < 0.782. The
result of this comparison is shown in the left panels of
Figure 19. The agreement with the one-to-one correspondence
of the redshifts is excellent, with a dispersion $\sigma_z \approx 0.005$
over the whole redshift range. This is in good agreement
with the expected redshift residual found from simulations
(with no constraint on the age; Fig. 15). The bottom
left panel shows a plot of the redshift residuals as
a function of the galaxy redshift. The mean residual is
$\sim 4 \times 10^{-4} \ll \sigma_z$, which shows that there are no systematic
effects in using SNID to determine the SN redshift.

To compare the supernova age determined through
cross-correlation with external measurements, we select
ESSENCE high-redshift SN Ia spectra for which a
well-sampled light curve is available around maximum
light (Miknaitis et al. 2007). This way we can determine
the time difference (in the observer frame) between
maximum light ($t_{\rm max}$) and the time the spectrum
was obtained ($t_{\rm spec}$) and compare this time interval with
the rest-frame age ($t_{\rm SNID}$) determined through
cross-correlation with local SN Ia templates. For this comparison
to make sense we must correct the light-curve age for
the (1 + z) time-dilation factor expected in an expanding
universe (Wilson 1939; Rust 1974; Leibundgut et al.
1996; Goldhaber et al. 2001). We expect a one-to-one
correspondence between

$$ t_{\rm LC} = \frac{t_{\rm spec} - t_{\rm max}}{1 + z} $$

and $t_{\rm SNID}$. The result is shown in the right panels of
Figure 19. We used a total of 54 spectra in the redshift
range 0.205 < z < 0.687, 27 of which had an associated
galaxy redshift, which we used as a constraint
when determining the age. The dispersion about the
$t_{\rm SNID} = t_{\rm LC}$ line is $\sigma_t \approx 2.9$ days over an age interval
$-10 \lesssim t_{\rm LC} \lesssim +20$, again in good agreement with the
expected residuals (Fig. 17). We show the residuals versus
$t_{\rm LC}$ in the bottom right panel. The mean residual
is approximately $-0.7$ days. The excellent correspondence
between $t_{\rm LC}$ and $t_{\rm SNID}$ shows that SNID can be
used in studies of time-dilation effects in high-redshift
multi-epoch SN Ia spectra (Riess et al. 1997; Foley et al.
2005).
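
The comparison just described reduces to a time-dilation correction followed by residual statistics. The Python sketch below shows one way to compute both. The function names, the sign convention for the residual ($t_{\rm SNID} - t_{\rm LC}$), and the use of the sample standard deviation are assumptions made for illustration, and the numbers in the example are invented rather than ESSENCE data.

```python
import numpy as np


def light_curve_age(t_spec, t_max, z):
    """Rest-frame age in days: observer-frame interval divided by (1 + z)."""
    t_spec = np.asarray(t_spec, dtype=float)
    t_max = np.asarray(t_max, dtype=float)
    z = np.asarray(z, dtype=float)
    return (t_spec - t_max) / (1.0 + z)


def residual_stats(t_snid, t_lc):
    """Mean and sample standard deviation of the age residuals.

    The residual is taken as t_SNID - t_LC (assumed sign convention).
    """
    residuals = np.asarray(t_snid, dtype=float) - np.asarray(t_lc, dtype=float)
    return residuals.mean(), residuals.std(ddof=1)


# Invented example: three spectra taken 6, 12 and 15 observer-frame days
# after maximum light, all at z = 0.5.
t_lc = light_curve_age([53006.0, 53012.0, 53015.0], 53000.0, 0.5)
mean_res, sigma_t = residual_stats([5.0, 7.0, 11.0], t_lc)
```

With these toy inputs the rest-frame ages are 4, 8 and 10 days, so the residuals are +1, -1 and +1 days.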

The correlation technique could not have yielded such
good results had the high-z SNe Ia in the sample been
significantly different from the SN Ia template spectra in
the SNID database. The fact that the correlation redshifts
and ages agree so well with the galaxy redshifts
and light-curve ages, respectively, is a strong argument
in favor of the similarity of these SNe Ia with local
counterparts.

6. TYPE DETERMINATION 

The results of § 5 are only valid if we assume that
we know the type of the input supernova spectrum
(in this case a normal SN Ia). Although SNID is tuned
to determining SN redshifts, we investigate its potential
in determining the SN type in an impartial way. We
base our investigation on a simple frequentist approach
as opposed to a more elaborate Bayesian one, but we
discuss the future implementation of the latter in SNID
in § 7.

In what follows we focus on five distinct examples, the
first three being particularly relevant to ongoing
high-redshift SN Ia searches: the distinction between
1991T-like SNe Ia and other SNe Ia (§ 6.1); the distinction
between SNe Ib/c and SNe Ia at high redshifts (§ 6.2); the
identification of peculiar SNe Ia (§ 6.3); finally, the
distinction between SNe Ib and SNe Ic and between SNe IIb
and both SNe II and Ib (§ 6.4), more relevant to ongoing
nearby (z < 0.1) supernova searches. We used the
same simulation setup as in § 5.1, except that we consider
correlations with all supernova types in the database.

The reader must keep in mind that, while the age
distribution of SN Ia templates is close to optimal in the
current SNID database (see § 5.4), those for supernovae
of other types are most certainly not. While the results
presented in this section are encouraging, they are no
doubt biased by the relatively low number of SN Ib/Ic/II
templates with respect to SNe Ia.

6.1. Normal versus 1991T-like SNe Ia

It can be challenging to distinguish the subtypes of
SNe Ia from one another. 1991T-like SNe Ia have a
peak luminosity at the bright end of the SN Ia distribution
and although their light curves still obey the
width-peak luminosity, or "Phillips," relation (Phillips
1993), it is useful to have an independent confirmation
of their high intrinsic luminosity from their spectra.
Spectra of 1991T-like SNe Ia are characterized by
the near-absence of Ca II and Si II lines in the
early-time spectra, and prominent high-excitation features
of Fe III, not found in normal SNe Ia. The Si II,
S II, and Ca II features develop during the post-maximum
ages and by ~ 2 weeks past maximum the spectra
of "1991T-like" objects are similar to those of normal
SNe Ia (Filippenko et al. 1992; Ruiz-Lapuente et al.
1992; Phillips et al. 1992; Jeffery et al. 1992).

In the top panels of Figure 20 we illustrate the ability of
SNID to identify 1991T-like SNe Ia at z = 0.5. The input
spectrum is a "Ia-91T" template in the SNID database
(Table 1) in the age interval $-5 \le t_B \le +5$, i.e., when the
spectroscopic differences with normal SNe Ia are most
apparent. We show the fraction of templates in the SNID
database that correlate with the input spectrum, as a
function of the rlap quality parameter: 1991T-like SNe Ia
(solid line), other SNe Ia (dashed line), and supernovae of
other types (dotted line). From left to right, we show the
effect of having no constraint on either age or redshift, a
flat ±0.01 constraint on redshift, a flat ±3 day constraint
on the age, and a combined constraint on both redshift
and age.
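
The template fractions plotted in these figures can be tabulated with a few lines of Python. The sketch below is only a plausible reading of the procedure: it assumes the correlation results are available as (rlap, type) pairs, one per template that correlates with the input spectrum, and it counts types within rlap bins of unit width, as stated in the figure captions. The function name and input format are hypothetical.

```python
import math
from collections import Counter, defaultdict


def template_fractions(correlations, bin_width=1.0):
    """Fraction of correlating templates of each type per rlap bin.

    `correlations` is an iterable of (rlap, sn_type) pairs (assumed format).
    Returns {lower bin edge: {sn_type: fraction}}.
    """
    counts = defaultdict(Counter)
    for rlap, sn_type in correlations:
        counts[math.floor(rlap / bin_width)][sn_type] += 1
    fractions = {}
    for index in sorted(counts):
        total = sum(counts[index].values())
        fractions[index * bin_width] = {
            sn_type: n / total for sn_type, n in counts[index].items()
        }
    return fractions


# Invented correlation results for one input spectrum.
example = [(5.2, "Ia-91T"), (5.7, "Ia-norm"), (5.9, "Ia-91T"), (6.1, "Ia-91T")]
fractions = template_fractions(example)
```

Here the bin starting at rlap = 5 holds two Ia-91T templates and one Ia-norm template, and the bin starting at rlap = 6 holds a single Ia-91T template.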

When there is no constraint on either redshift or age,
the fraction of Ia-91T templates (solid line) for rlap > 10
is greater than that for other SN Ia templates (dashed
line). Note that for rlap > 5 the confusion with supernovae
of other types is practically non-existent (< 2%).
Adding a constraint on the redshift does not lead to a
significant improvement, but a constraint on age reduces
the cross-over rlap value between other SN Ia and
1991T-like templates from rlap > 15 to rlap ≈ 6. The negative
spikes around rlap ≈ 15 are statistical noise due to
the small number of templates with rlap values in excess
of that cutoff.

The bottom panels of Figure 20 show the same lines
for input spectra in the age interval $+5 \le t_B \le +15$.
At these post-maximum ages, the differences between
1991T-like and normal SNe Ia are less apparent (see the
rightmost panel) and the impact on the ability of SNID
to distinguish between the different Ia subtypes is readily
apparent. In the absence of constraints on redshift
or age, the fraction of Ia-91T templates reaches a peak
of ~ 70%, while it increases to 100% with constraints on
redshift and age. The variation for rlap > 15 is again
statistical noise due to the limited number of templates
with such good correlations.

Fig. 19. Left: Comparison of redshifts determined from cross-correlations with SN Ia spectral templates using SNID ($z_{\rm SNID}$) and
from narrow lines in the host galaxy spectrum ($z_{\rm gal}$; top). We show the residuals vs. $z_{\rm gal}$ in the bottom panel. Right: Comparison of
supernova spectral ages determined using SNID ($t_{\rm SNID}$) and rest-frame light-curve ages ($t_{\rm LC}$) of high-z SNe Ia (0.164 < z < 0.587; top).
We show the residuals vs. $t_{\rm LC}$ in the bottom panel. The data are from the ESSENCE project (Matheson et al. 2005; Miknaitis et al. 2007;
Foley et al., in prep; Blondin et al., in prep).

Fig. 20. Attempt to identify a 1991T-like SN Ia at z = 0.5 in the age interval $-5 \le t_B \le +5$. Top: Fraction of templates in the SNID
database corresponding to a certain supernova type (1991T-like SN Ia: solid line; SN Ia of other subtypes: dashed line; supernova of other
types: dotted line), in rlap bins of size unity. Left to right: With no constraints on the redshift or age, with a ±0.01 constraint on the
redshift, with a ±3 day constraint on the age, and with a combined constraint on redshift and age. Bottom: Same lines as above, but for
post-maximum spectra in the age interval $+5 \le t_B \le +15$. The right panel shows representative spectra of 1991T-like and normal SNe Ia,
around maximum light (top) and ~ 1-2 weeks past maximum (bottom), as observed with a typical optical spectrograph at z = 0.5 (the
cross-hatched area represents the rest-frame portion of the spectrum that is lost due to the redshift). Note that the relative differences in
pseudo-continuum shapes have no impact on the SNID results. [See the electronic version of the Journal for a color version of this figure.]

The difficulty of distinguishing between normal and
1991T-like SNe Ia at high redshift could partly explain
the apparent lack of 1991T-like SNe Ia at high redshifts
(2/52 ≈ 4% in the SN Ia sample published by
Matheson et al. 2005) with respect to the fraction
expected locally (up to ~ 20% according to Li et al. 2001b).

6.2. SN Ia versus SN Ib/c

The misidentification of supernovae of other types as
SNe Ia is a major concern for ongoing high-redshift SN Ia
searches. Including only a small fraction of non-Ia
supernovae in a sample would lead to a miscalibration
of the absolute magnitudes of these objects and to biases
in the derived cosmological parameters (Homeier
2005). A particular concern is the contamination of
high-z SN Ia samples with SNe Ib/c. At redshifts z > 0.4, the
defining Si II λ6355 absorption feature of SNe Ia (also
present, although somewhat weaker, in SNe Ic) is redshifted
out of the range of most optical spectrographs
and one has to rely on spectral features blueward of this
to determine the supernova type. Some of these features,
such as the Ca II H and K λλ3934, 3968 doublet,
are common to both SNe Ia and SNe Ib/c. Other features
characteristic of SN Ia spectra around maximum
light (e.g., S II λλ5454, 5640) are generally weak and can
be difficult to detect in low-S/N spectra. One has to
invoke external constraints, such as the SN color evolution,
light-curve shape, host galaxy morphology (only
SNe Ia occur in early-type hosts; Cappellaro et al. 1997),
or the expected apparent peak magnitude: SNe Ib/c at
maximum light are often > 1 mag fainter than SNe Ia
(Richardson et al. 2006, although Clocchiatti et al. 2000
have reported on one SN Ic with an absolute magnitude
similar to normal SNe Ia) and hence are only expected
to "pollute" magnitude-limited samples of SNe Ia at the
lower-redshift end. If the redshift is not known, SNe Ib/c
can be a serious contaminant for high-redshift SN Ia
searches.

In the top panels of Figure 21 we illustrate the ability
of SNID to identify SNe Ib/c at z = 0.3. The input spectrum
is a "Ib-norm" or "Ic-norm" template in the SNID
database (Table 1) in the age interval $-5 \le t_B \le +5$. We
show the fraction of templates in the SNID database that
correlate with the input spectrum, as a function of the
rlap quality parameter: all SNe Ib/c (including SNe IIb;
solid line), SNe Ia (dashed line), and SNe II (excluding
SNe IIb; dotted line).

In the absence of a constraint on age, correlations with
rlap > 4 are sufficient to recover a dominant fraction of
SNe Ib/c over SNe Ia. The confusion with SNe II is practically
nonexistent (< 5% for rlap < 3; 0% for rlap > 3).
A constraint on age only reduces the cross-over rlap between
SNe Ia and SNe Ib/c from rlap ≈ 4 to rlap ≈ 3 but
leads to fewer correlations with rlap > 5 (hence the noise
spikes in the recovered template fractions). A combined
constraint on redshift and age yields a 100% SN Ib/c
fraction for rlap > 4, but no correlations with rlap > 8.

At z = 0.5, the results are essentially unchanged from
z = 0.3, although the absolute number of good correlations
(rlap > 5) is generally lower. We also note that
these results are biased by the confusion between SNe Ib
and SNe Ic at this redshift (up to 40%). The constraint
on age leads to a more significant improvement
than at z = 0.3, suggesting that the similarity between
spectra of SNe Ic around maximum light and SNe Ia at
~ 1-2 weeks past maximum is less problematic over this
restricted wavelength range.

Note that the misclassification of SNe Ib/c as
SNe Ia (or the reverse) can sometimes pose problems
even with the high-S/N spectra of nearby objects,
especially at later ages or if no age information is
available. A striking example is the nearby SN Ic
SN 2004aw (Taubenberger et al. 2006), originally classified
as an SN Ia by Benetti et al. (2004). More recently,
two nearby supernovae originally announced as
Type Ic events around maximum light (SN 2006bb
and SN 2006bk; Kinugasa & Yamaoka 2006a,b) were
reclassified as SNe Ia at 2-3 weeks past maximum light
based on cross-correlations with SN spectra of all types
using SNID (Blondin et al. 2006b).

6.3. Identifying SN Ia "Oddballs"

Some SNe Ia, which we refer to as peculiar, do not belong
to any of the normal, 1991T-like, or 1991bg-like
categories. Such is the case of SNe 2000cx (Li et al. 2001a),
2002cx (Li et al. 2003) and 2005hk (Phillips et al. 2007).
The first of these has pre-maximum spectra similar to
those of SN 1991T, although the Si II lines that appear
around maximum light remain strong several weeks past
maximum. SN 2002cx, "the most peculiar known SN Ia"
(Li et al. 2003; Branch et al. 2004; Jha et al. 2006b), and
SN 2005hk are even more difficult to accommodate in
the current classification scheme: their early-time spectra
show signatures of high-ionization lines of iron, as in
the overluminous SN 1991T, but their luminosity is similar
to that of the subluminous SN 1991bg. Moreover,
their I-band light curves are devoid of the secondary
maximum present in all other Ia subtypes. Both objects
possibly originate from a pure deflagration explosion (see
Phillips et al. 2007) and could form an altogether separate
class of SNe Ia.

These peculiar SNe Ia do not obey the Phillips relation
and thus cannot be used as calibrated distance indicators.
They must be weeded out of high-redshift SN Ia
samples in order to avoid significant biases in the derived
cosmological parameters. Until recently, there was no
evidence for such peculiar events at high redshifts. This has
changed with the recent discovery of the overluminous
SNLS-03D3bb (SN 2003fg) at z = 0.244 (Howell et al.
2006).

In the top panels of Figure 22 we illustrate the ability
of SNID to identify peculiar SNe Ia at z = 0.3. The input
spectrum is a "Ia-pec" template in the SNID database
(Table 1) in the age interval $-5 \le t_B \le +5$. We show the
fraction of templates in the SNID database that correlate
with the input spectrum, as a function of the rlap quality
parameter: peculiar SNe Ia (solid line), other SNe Ia
(dashed line), and supernovae of other types (dotted line).
In the absence of a constraint on redshift, the maximum
recovered fraction of Ia-pec templates is ~ 5%, with no
correlations at rlap > 6. With a constraint on redshift,
the Ia-pec template fraction peaks at ~ 70%, but the best
correlations are always for another Ia subtype (most
frequently 1991T-like). At z = 0.5, the recovered fraction of
Ia-pec templates is complete for rlap > 5 for both a
constraint on redshift and a combined constraint on redshift
and age. This apparent improvement is counterbalanced
by the absence of correlations with rlap > 9. Note that
in the absence of any constraint, the recovered fraction
of Ia-pec templates is dominant for rlap > 6.

Despite the limited number of peculiar SNe Ia in our
database (SN 2000cx, SN 2002cx and SN 2005hk, for a
total of 15 spectra in the age interval $-5 \le t_B \le +5$;
see Table 1), the change in template fraction from z =
0.3 to z = 0.5 gives indications as to which rest-frame
wavelength range is most valuable for determining the
SN type. This again calls for a wavelength (and age)
weighting of the spectrum overlap parameter, lap($\lambda$, t)
(see § 5.5), which we are working to implement in a future
version of SNID.

Fig. 21. Same as Fig. 20 but for normal SNe Ic in the age interval $-5 \le t_B \le +15$, at z = 0.3 (top) and z = 0.5 (bottom). Here the
lines correspond to fractions of SNe Ib/c (solid line), SNe Ia of all subtypes (dashed line), and SNe II (excluding SNe IIb; dotted line).
[See the electronic version of the Journal for a color version of this figure.]

Fig. 22. Same as Fig. 20 but for a peculiar SN Ia in the age interval $-5 \le t_B \le +15$, at z = 0.3 (top) and z = 0.5 (bottom). Here the
lines correspond to fractions of peculiar SNe Ia (solid line), other SNe Ia (dashed line), and all other supernova types (dotted line). [See
the electronic version of the Journal for a color version of this figure.]

We note that SNID unambiguously confirms the
similarity between SN 2005hk and SN 2002cx (see
Phillips et al. 2007): for all the spectra of SN 2005hk in
the SNID database (Table 1), the best-match template
spectrum is SN 2002cx, whether a constraint on redshift
and age is applied or not. In the absence of a constraint
on redshift, however, the fraction of 1991T-like templates
that correlate with the input SN 2005hk spectrum
increases dramatically, leading to an overestimation of the
redshift by ~ 0.02, roughly corresponding to the difference
in absorption velocities between SN 2005hk and
SN 1991T at a given age (~ 6000 km s$^{-1}$; Phillips et al.
2007).
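
The correspondence between a velocity difference of ~ 6000 km s$^{-1}$ and a redshift offset of ~ 0.02 follows from the first-order Doppler relation. The short Python check below makes the arithmetic explicit. The function name is ours, and the (1 + z) factor for a source at cosmological redshift z comes from composing the two shifts; the non-relativistic approximation is an assumption that is adequate at these velocities.

```python
C_KM_S = 299792.458  # speed of light in km/s


def redshift_offset(delta_v_km_s, z=0.0):
    """First-order redshift offset produced by a line-of-sight velocity difference.

    Composing the shifts gives 1 + z_obs = (1 + z)(1 + v/c), so the offset in
    observed redshift is (1 + z) * v / c for v much smaller than c.
    """
    return (1.0 + z) * delta_v_km_s / C_KM_S


offset_absorption = redshift_offset(6000.0)  # about 0.020
offset_dispersion = redshift_offset(200.0)   # about 0.0007
```

The second value shows that the galaxy velocity dispersions quoted in § 5.6 correspond to redshift offsets well below the typical SNID redshift error.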

SNLS-03D3bb (SN 2003fg) at z = 0.244 (Howell et al.
2006), on the other hand, illustrates the limitations of the
cross-correlation approach in determining the SN type
when applied to objects that are not part of the library of
spectral templates. Its spectrum (at +2 days; A. Howell,
private communication) is unique among supernova spectra
and we have no similar examples in our database. In
the absence of constraints, the best-match template is the
1991T-like SN 1999dq (Matheson et al. 2007) at t = +6.2
days (z = 0.254 ± 0.005). With constraints on age or redshift,
the best-match template is in all cases the normal
SN Ia SN 1989B at t = +3.5 days (z = 0.251 ± 0.005).
The spectrum of SNLS-03D3bb is now part of the SNID
database and will prove useful to identify such peculiar
objects at all redshifts.

6.4. Further Specific Examples 

We next focus on two further specific examples, relevant
to the spectroscopic classification of supernovae in
nearby (z < 0.1) SN searches: the distinction between
SN Ib and Ic, and that between SN IIb and II/Ib.

6.4.1. SN Ib versus SN Ic

SN Ib and Ic are often difficult to tell apart and are
sometimes referred to as "SNe Ib/c" in the literature
(e.g., SN 1999ex; Hamuy et al. 2002) and IAU circulars.
This difficulty is inherent to the SN classification
scheme, rather than stemming from a physical misconception
of these events, both of which are believed to originate in
the core collapse of a massive star, stripped of its outer
layers through either stellar winds or interaction with
a binary companion (Woosley et al. 1993, 1995). SNe Ib
are defined by the presence of conspicuous lines of He I in
their optical spectra, whereas SNe Ic are defined by their
quasi-absence (Matheson et al. 2001). The Si II λ6355
feature is weak in SNe Ib/c, which enables one to differentiate
them from SNe Ia, at least in principle (see
§ 6.2). Both are of Type I and are thus also defined by
the absence of hydrogen lines in their spectra, although
the case has recently been made for some hydrogen being
present in SNe Ib/c (Branch et al. 2006; Elmhamdi et al.
2006).

In Fig. 23 we show the result of cross-correlating SN Ib
spectra at low redshift (z = 0.1) within 10 days from
V-band maximum with SNe of all types in the SNID
database: Type Ib (excluding Type IIb; solid line), Type
Ic (dashed line), and other SN types (dotted line). The
lines correspond to the fraction of template supernovae of
a given type, as a function of the rlap quality parameter.
We deliberately exclude the IIb subtype from this analysis,
as this type is a hybrid between the II and Ib subtypes
(see below). The results of Fig. 23 are encouraging:
for rlap > 6, the recovered fraction of SN Ib dominates
over SNe Ic, with less than 25% confusion with other
supernova types. For rlap > 13, only Type Ib templates
correlate with the input spectrum.

6.4.2. SN IIb versus SN II/Ib

Some supernovae "evolve" from one type to another as
they age. Such is the case for SNe IIb (Woosley et al.
1987), whose early-time spectrum is characterized by
prominent P Cygni lines of the hydrogen Balmer series
(as in SNe II), but which later develop conspicuous
lines of He I, as in SNe Ib. A prototypical example
of such a supernova is SN 1993J (Nomoto et al. 1993;
Matheson et al. 2000). The SNID database currently
contains spectra for three SNe IIb: SN 1993J, SN 2000H
and SN 2006T (Table 1).

In Fig. 24 we study the fraction of templates corresponding
to Type IIb (solid line), Type II (dashed line),
and Type Ib (dotted line), when the input spectrum is
a Type IIb at low redshift (z = 0.1), within 15 days
past explosion. At these ages, the He I lines are somewhat
weaker than after maximum and the confusion with
SNe II is greater: only for rlap > 7 is the fraction of
recovered IIb templates greater than that of ordinary SNe II. The
confusion with Type Ib templates is small (< 20%) up
to rlap ≈ 9 and null for larger values of rlap. Templates
corresponding to SNe Ia and Ic do not correlate
well with input Type IIb spectra and the confusion is
practically nonexistent (< 5%, not shown here). Again,
more SNe IIb are needed to truly test the ability of SNID
to correctly identify them.

Fig. 23. Attempt to classify an SN Ib at z = 0.1 in the age
interval from -10 to +10 days from maximum light. The lines correspond to fractions of
matching templates of Type Ib (solid line), Type Ic (dashed line)
and all other types (dotted line), in rlap bins of size unity. The
Type IIb subclass, included in both the SN Ib and SN II categories,
is excluded for this purpose. The right panel shows representative
spectra of normal SN Ib and Ic around maximum light. Note that
the relative differences in pseudo-continuum shapes have no impact
on the SNID results. [See the electronic version of the Journal for
a color version of this figure.]

[Figure 24 panels. Title: IIb, z = 0.1, $t_{\rm exp} \le +15$ d. Left panel: template fraction (0.0 to 1.0) versus rlap, with curves labeled IIb, IIP and Ib. Right panel: spectra versus rest wavelength (Å), 4000 to 7000.]

Fig. 24. Same as Fig. 23 but for an SN IIb at z = 0.1 in the
age interval $t_{\rm exp} \le +15$ ($t_{\rm exp}$ is the number of days past explosion).
The lines correspond to fractions of matching templates of Type
IIb (solid line), Type II (dashed line) and Type Ib (dotted line), in
rlap bins of size unity. [See the electronic version of the Journal
for a color version of this figure.]

6.5. Using SNID for SN Identification 

The previous examples illustrate the ability of SNID
to recover a significant fraction of supernovae in the
database corresponding to the input SN type. While
Figs. 20-24 are informative, they do not provide a unique
answer to the following: how does one use SNID to
determine the type of an SN spectrum and can one relate
the rlap quality parameter to a formal confidence that
an input spectrum is of a certain type (and, sometimes
more importantly, that it is not of another type)? The
answer is far from settled, and its resolution will probably
involve a more sophisticated Bayesian approach (see
next section). Nevertheless, we have tested the following
classification schemes:

1. The SN type (subtype) is simply determined as the
type (subtype) of the best-match template(s) for
rlap greater than some cutoff value rlap$_{\rm min}$.

2. The SN type (subtype) is the one corresponding to
the highest fraction of templates corresponding to
a given type (subtype) for rlap > rlap$_{\rm min}$, with the
possible additional requirement that this fraction
exceeds some cutoff.

3. The SN type (subtype) is the one whose fraction
increases most with increasing rlap (i.e., the lines
shown in Figs. 20-24 that have the highest positive
gradient for rlap > rlap$_{\rm min}$), with the possible
additional requirement that this gradient exceeds
some cutoff.

The first of these is the one commonly used for the
identification of supernovae, at both low and high redshift.
In IAU circulars, a supernova is announced to be of
a given type when it is "(most) similar to supernova X at
N days from maximum light." In high-z SN Ia searches,
a secure type is determined when a given spectrum is
sufficiently similar to a nearby SN Ia (e.g., Matheson et al.
2005), but it is never clear how different it is from
supernovae of other types. However, the best-match template
is not always the best indicator of the SN type (see
the distinction between post-maximum spectra of 1991T-like
SNe Ia and other SNe Ia in Fig. 20) and the large
spectral database used in SNID offers the possibility to
use the statistical power of correlations exceeding a certain
rlap cutoff. This second classification scheme has
already been used for the classification of high-z SN Ia
spectra using SNID (Miknaitis et al. 2007). A drawback
of such an approach is that the determination of the SN
type depends on the completeness of the SN database,
which comprises few template spectra of core-collapse
SNe (Ib, Ic, and II; see § 4). Thus, for instance, it is not
possible to identify peculiar SNe Ia this way (Fig. 22).
Last, the "gradient method" for classification is generally
more robust, but also requires some other constraint on
either the template fraction or the type of the best-match
template(s).
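
The three classification schemes can be sketched in Python as follows. This is an illustrative reading of the descriptions above, not the SNID code: the input format (a list of (rlap, type) pairs, one per correlating template), the function names, the unit-width rlap bins, and the use of a least-squares slope as the "gradient" are all assumptions.

```python
import math
from collections import Counter

import numpy as np


def best_match_type(matches, rlap_min):
    """Scheme 1: type of the best-correlating template with rlap > rlap_min."""
    good = [(rlap, sn_type) for rlap, sn_type in matches if rlap > rlap_min]
    if not good:
        return None
    return max(good, key=lambda m: m[0])[1]


def majority_type(matches, rlap_min, min_fraction=0.0):
    """Scheme 2: most frequent type among templates with rlap > rlap_min."""
    types = [sn_type for rlap, sn_type in matches if rlap > rlap_min]
    if not types:
        return None
    sn_type, count = Counter(types).most_common(1)[0]
    return sn_type if count / len(types) >= min_fraction else None


def gradient_type(matches, rlap_min, min_gradient=0.0):
    """Scheme 3: type whose per-bin fraction rises fastest with rlap."""
    bins = {}
    for rlap, sn_type in matches:
        if rlap > rlap_min:
            bins.setdefault(math.floor(rlap), Counter())[sn_type] += 1
    if len(bins) < 2:
        return None  # a slope needs at least two rlap bins
    edges = sorted(bins)
    slopes = {}
    for sn_type in {t for counts in bins.values() for t in counts}:
        fractions = [bins[b][sn_type] / sum(bins[b].values()) for b in edges]
        slopes[sn_type] = np.polyfit(edges, fractions, 1)[0]
    best = max(slopes, key=slopes.get)
    return best if slopes[best] > min_gradient else None


# Invented correlation results for one input spectrum.
matches = [(4.0, "II"), (5.2, "Ib"), (5.5, "Ia"), (5.6, "Ib"), (6.1, "Ib"),
           (6.3, "Ia"), (6.8, "Ia"), (7.4, "Ia"), (7.9, "Ia")]
scheme1 = best_match_type(matches, 5.0)
scheme2 = majority_type(matches, 5.0)
scheme3 = gradient_type(matches, 5.0)
```

In this toy case all three schemes return "Ia", but they need not agree in general, which is why a combination of constraints is suggested in the text.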

We are extensively testing these different combina- 
tions, to optimize the type determination for all SN 
types, at varying z, t, and S/N, although we suspect that 
a more elaborate Bayesian treatment will be required to 
properly account for the probability of an input spectrum 
to be of a certain type (as well as not being of some other 
type). 

7. COMPARISON WITH OTHER METHODS AND 
FURTHER IMPROVEMENTS 

Several other methods are used to determine the type,
redshift, or age of a supernova spectrum. We briefly
describe them here, with a distinction between
cross-correlation and "$\chi^2$-based" methods. Last, we comment
on the alternative Bayesian approach to supernova
classification (only applied so far to photometric
measurements) and its possible implementation in SNID.

Other spectral classification methods involve principal
component analysis (PCA), possibly in combination with
artificial neural networks. These are beyond the scope of
this paper and we do not discuss them here, although the
PCA method has already been applied to SN Ia
classification by James et al. (2006).

7.1. Cross-correlation Methods 

We are aware of two other algorithms based on the
cross-correlation techniques of Tonry & Davis (1979).
Both are aimed at determining redshifts of galaxies (or
stars) in large surveys, but could easily be tuned (e.g., by
modifying the shape of the bandpass filter and including
age information) to supernovae as in SNID.

XCSAO (Kurtz & Mink 1998) is a program that is part of the
IRAF^ RVSAO package, aimed at obtaining redshifts
and radial velocities from digital spectra. It has been
used extensively in the past for redshift surveys (for
references see Kurtz & Mink 1998) and is currently used
in the Smithsonian Hectospec Lensing Survey (SHELS;
Geller et al. 2005). The basic algorithm is the same as
that described in § 2, although some important
differences exist. First, the overlap in wavelength between
the input and template spectra at the correlation
redshift (the lap parameter discussed in § 3.2) is not taken
into account explicitly; rather it is maximized by
including spectral templates at different redshifts. Second,
XCSAO directly selects the best peak in the correlation
function, rather than looking at the 10 best peaks
individually. Last, the reported correlation redshift is that
associated with the best-match template (i.e., the one
with the highest correlation height-noise ratio, r), rather
than being the median of all redshifts above a certain
cutoff rmin. It may be that not including the lap parameter
is less important for galaxy redshift determination,
due to the narrower widths of spectral lines in galaxy
spectra, and the iterative scheme implemented in XCSAO
ensures that an optimal correlation redshift is computed
(D. Mink 2007, private communication). For supernova
spectra, however, the inclusion of the lap parameter is
fundamental to obtain reliable redshifts (see § 5.2). The
other two differences are of a less fundamental nature.
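
The third difference (best-match redshift versus the median of all redshifts above a cutoff) can be sketched in a few lines of Python. The function names, the (r, z) input format, and the numbers are hypothetical; neither XCSAO nor SNID is claimed to be coded this way.

```python
from statistics import median


def median_redshift(results, r_min):
    """Median redshift of all correlations with r >= r_min (None if none pass)."""
    selected = [z for r, z in results if r >= r_min]
    return median(selected) if selected else None


def best_match_redshift(results):
    """Redshift of the single correlation with the highest r."""
    return max(results, key=lambda rz: rz[0])[1]


# Hypothetical (r, z) pairs, one per template.
results = [(8.0, 0.52), (6.5, 0.50), (6.0, 0.51), (2.0, 0.90)]
z_median = median_redshift(results, r_min=5.0)
z_best = best_match_redshift(results)
```

The median is less sensitive to a single outlying template, at the cost of depending on the chosen cutoff.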

Another algorithm based on the correlation techniques
of Tonry & Davis (1979) is RUNZ (Saunders et al. 2004),
a program used by members of the 2dF Galaxy Redshift
Survey (Colless et al. 2001) and the 6dF Galaxy Survey
(Jones et al. 2004). An essential difference with SNID is
the scaling of the input spectrum by its inverse variance
(see § 2.2), which leads to improved cross-correlation
redshifts for galaxy spectra. As mentioned earlier in § 2.2, no
such improvement is found for supernova redshifts, since
the power spectrum of a typical variance spectrum (for
ground-based observations) peaks at higher wavenumbers
than for SN spectra. One advantage of RUNZ over
SNID is the implementation of Gaussian constraints on
redshift, a feature that will be part of a future version of
SNID.

7.2. χ² Methods

An alternative to cross-correlation techniques involves
the minimization of a χ²-like quantity at discrete redshift
intervals, to find the best match between an input
and template spectrum. Such techniques do not enable
a formal evaluation of the redshift error as in SNID
(see § 3.4). An elaborate implementation of this approach
is "superfit" (Howell et al. 2005), in which an input
spectrum is compared to a combination of a (possibly
reddened) template supernova spectrum and a template
galaxy spectrum at a given redshift. A similar program
to superfit is SN-fit (Sainton 2004; see also discussion
in Blondin et al. 2005). The number of free parameters
means that the execution time is 2-3 orders of magnitude
greater than for SNID. For this reason we could not test
superfit in the same way we conducted the simulations
presented in the previous sections.

^ IRAF is distributed by the National Optical Astronomy
Observatories, operated by the Association of Universities for
Research in Astronomy, Inc., under contract to the National
Science Foundation of the United States.
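
To make the idea of minimizing a χ²-like quantity at discrete redshift intervals concrete, the following Python sketch scans a redshift grid with a single template and a free flux scale. It is a minimal illustration under stated assumptions, not the superfit or SN-fit algorithm: the function name, the synthetic Gaussian absorption feature, the uniform noise level, and the grid spacing are all hypothetical, and reddening and host-galaxy light are ignored.

```python
import numpy as np


def chi2_redshift(wave_obs, flux_obs, sigma, wave_rest, flux_rest, z_grid):
    """Brute-force chi-square scan; returns (best_z, chi2 at each grid point)."""
    weights = 1.0 / sigma**2
    chi2 = np.empty(len(z_grid))
    for i, z in enumerate(z_grid):
        # Redshift the template and resample it onto the observed wavelengths.
        model = np.interp(wave_obs, wave_rest * (1.0 + z), flux_rest)
        # Best-fit multiplicative flux scale (analytic weighted least squares).
        scale = np.sum(weights * flux_obs * model) / np.sum(weights * model**2)
        chi2[i] = np.sum(weights * (flux_obs - scale * model) ** 2)
    return z_grid[np.argmin(chi2)], chi2


# Synthetic rest-frame template: flat continuum with one absorption feature.
wave_rest = np.linspace(3000.0, 8000.0, 1001)
flux_rest = 1.0 - 0.5 * np.exp(-0.5 * ((wave_rest - 5000.0) / 100.0) ** 2)

# Noise-free "observation" of the same template at z = 0.3, scaled by 2.
wave_obs = np.linspace(5000.0, 9000.0, 401)
flux_obs = 2.0 * np.interp(wave_obs, wave_rest * 1.3, flux_rest)
sigma = np.full_like(wave_obs, 0.05)

z_grid = np.round(np.arange(0, 61) * 0.01, 2)
best_z, chi2 = chi2_redshift(wave_obs, flux_obs, sigma,
                             wave_rest, flux_rest, z_grid)
```

Each added free parameter (reddening, galaxy type, galaxy fraction) multiplies the number of models evaluated per redshift step, which is consistent with the longer execution times mentioned above.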

The advantage of this technique lies in the evaluation
of the fraction of galaxy light in the input spectrum
and its subsequent subtraction to obtain a "pure"
supernova spectrum. Consequently, supernovae can be
classified even when the galaxy contamination fraction is
high (< 75%; Howell et al. 2005). Given the impact of
galaxy contamination on correlation redshifts (Fig. 15),
we expect more accurate redshifts when this contamination
is removed from the input spectrum. However, the
typical errors on redshift are similar to those in SNID
(Hook et al. 2005). This suggests that the redshift
accuracy is limited by physical properties of supernovae
(namely, the velocity location of their prominent spectral
features) rather than by differences in algorithm.
The accuracy on the age determination is similar to that
in SNID, with a ~3-day dispersion about the one-to-one
correspondence with the light-curve age (Hook et al.
2005; Howell et al. 2005). We note that the current version
of superfit has not been extensively tested for redshift
and age determination and has not been tested at
all for type determination (A. Howell 2006, private
communication).
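
The estimate and subtraction of galaxy light can be sketched as a linear least-squares problem. The Python example below is an assumption-laden illustration rather than the superfit procedure: it takes both templates as already redshifted and sampled on the observed wavelength grid, ignores reddening and noise weighting, and defines the galaxy fraction as a ratio of summed fluxes. The function name and the synthetic templates are hypothetical.

```python
import numpy as np


def fit_sn_plus_galaxy(flux_obs, sn_template, gal_template):
    """Fit flux_obs ~ a_sn * sn_template + a_gal * gal_template.

    Returns (a_sn, a_gal, galaxy-subtracted spectrum, galaxy light fraction).
    """
    design = np.column_stack([sn_template, gal_template])
    coeffs, *_ = np.linalg.lstsq(design, flux_obs, rcond=None)
    a_sn, a_gal = coeffs
    galaxy_part = a_gal * gal_template
    total_model = a_sn * sn_template + galaxy_part
    gal_fraction = galaxy_part.sum() / total_model.sum()
    return a_sn, a_gal, flux_obs - galaxy_part, gal_fraction


# Synthetic, noise-free example with known coefficients.
x = np.linspace(0.0, 1.0, 50)
sn_template = 1.0 + np.sin(6.0 * x)
gal_template = 1.0 + x
flux_obs = 0.5 * sn_template + 1.5 * gal_template

a_sn, a_gal, sn_only, gal_fraction = fit_sn_plus_galaxy(
    flux_obs, sn_template, gal_template)
```

The recovered `sn_only` spectrum is what would then be compared with supernova templates for classification.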

A variant of the approach implemented in superfit is
the spectral feature aging (SFA) algorithm of Riess et al.
(1997), used to determine the age of a normal SN Ia of
known redshift. In SFA, the input spectrum is divided
up into several wavelength intervals (typically eight) and
each of these spectral regions is compared with corresponding
ones in a database of template spectra. The
age accuracy is similar to that in SNID but is highly
sensitive to the wavelength intervals of the spectral
regions used to divide up the input spectrum (Foley et al.
2005). Another tool, based on the SFA algorithm, plans
to extend the age determination to spectra of all types
(Harutyunyan et al. 2005).

7.3. Bayesian Approach to SN Classification 

Recently, several authors have presented Bayesian
methods to determine the type of a supernova based
on a single-epoch photometric measurement (potentially
in multiple bands; Poznanski, Maoz, & Gal-Yam 2006),
or on multiband light curves (Kuznetsova & Connolly
2007; Kunz et al. 2006). The motivation behind these
purely photometry-based approaches is the planned next
generation of wide-field all-sky surveys (such as
Pan-STARRS and LSST), for which many SN Ia candidates
will be too faint for spectroscopy. However, these
techniques are in principle applicable to ongoing
high-redshift SN Ia surveys, which are limited by the amount
of available spectroscopy time (Matheson et al. 2005;
Howell et al. 2005). In fact, Tonry et al. (2003) already
adopt a Bayesian approach in fitting high-z SN Ia light
curves with BATM (Bayesian Adapted Template Match;
Tonry et al. 2003).

These Bayesian-based approaches assign a probability
for a supernova to be of a certain type, based on
a set of measurements (e.g., light curve points in a given
photometric band) and given a model (or template)
that depends on a set of parameters. As pointed out
by Kuznetsova & Connolly (2007), when computing this
probability it is generally assumed that the input is indeed
a supernova of a known type, although in principle
one could extend the formalism to incorporate all
known astrophysical objects. Moreover, such methods
invoke "marginalization over type," which poses some
conceptual problems, since it assumes the SN classification
scheme to be both complete (i.e., to include all possible
SN types) and hermetic. Yet, there appears to be a
continuum of properties relating different SN types (e.g.,
Type Ib and Ic), some supernovae evolve from one type
to another (e.g., Type IIb), and others still seem to defy
classification (such as the "peculiar SNe Ia" SN 2002cx
or SN 2005hk).
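
The role of the completeness assumption can be seen in a schematic Python sketch of a type posterior. This is a generic application of Bayes' theorem, not the method of any of the papers cited above; the function name, the default flat prior, and the example likelihood values are hypothetical.

```python
import math


def type_posteriors(log_likelihoods, priors=None):
    """Posterior probability of each type, normalized over the listed types.

    `log_likelihoods` maps a type name to ln L(data | type), with model
    parameters assumed to be already marginalized. A flat prior over the
    listed types is used when `priors` is None.
    """
    types = list(log_likelihoods)
    if priors is None:
        priors = {t: 1.0 / len(types) for t in types}
    log_post = {t: log_likelihoods[t] + math.log(priors[t]) for t in types}
    shift = max(log_post.values())  # subtract the maximum to avoid underflow
    unnorm = {t: math.exp(v - shift) for t, v in log_post.items()}
    total = sum(unnorm.values())
    return {t: u / total for t, u in unnorm.items()}


# Hypothetical case: the data are three times more likely under "Ia".
posteriors = type_posteriors({"Ia": 0.0, "Ib/c": math.log(1.0 / 3.0)})
```

By construction the probabilities sum to unity over the listed types only, so an object belonging to none of them is still assigned to one of them. This is the completeness assumption discussed above.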

The current version of SNID does not incorporate such 
a methodology in its supernova classification. Neverthe- 
less, the change in the relative fraction of templates of 
a given type as a function of the rlap quality parameter 
could be folded in as an extra constraint on the SN type 
in a Bayesian framework. 

8. CONCLUSION 

We have presented an algorithm, based on the correlation
techniques of Tonry & Davis (1979), that can be
used to determine the redshift and age of a supernova
spectrum and place constraints on its type. We develop
a diagnostic, the rlap quality parameter, to quantify the
reliability of a given correlation between the input and a
template spectrum. This parameter is simply the product
of the Tonry & Davis (1979) correlation height-noise
ratio (r) and the overlap in rest-frame ln λ space between
the input and template spectrum at the correlation
redshift (lap). This rlap diagnostic has the advantage of
enabling the formal computation of the redshift error,
proportional to 1/(1 + rlap). We show, based on simulations,
that for rlap > 5, the typical error on redshift
and age is σ_z < 0.01 and σ_t < 3 days, respectively. The
former accuracy on redshift is confirmed through a comparison
of correlation redshifts with host-galaxy redshifts
(determined from narrow lines in the spectrum) out to
redshifts z < 0.8. The latter accuracy on age is confirmed
through a comparison of (rest frame) spectral ages using
SNID and (observer frame) light curve ages corrected for
the (1 + z) time-dilation factor expected in an expanding
universe. The fact that both age estimates agree so well
is itself a verification of the cosmological nature of redshifts,
previously tested with multiepoch SN Ia spectra
(Riess et al. 1997; Foley et al. 2005). Furthermore, the
success of SNID in determining the redshift and age of
the high-redshift SN Ia spectra in our sample shows that
these are similar to low-redshift counterparts.
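
The quantities summarized in this paragraph reduce to a few lines of arithmetic, sketched here in Python. The function names and the input values are hypothetical; in particular, the proportionality constant `k` is an assumption, since the text only states that the redshift error scales as 1/(1 + rlap).

```python
def rlap(r, lap):
    """Quality parameter: product of the height-noise ratio r and the overlap lap."""
    return r * lap


def redshift_error(rlap_value, k):
    """Formal redshift error, sigma_z = k / (1 + rlap).

    k is a hypothetical proportionality constant, to be calibrated separately.
    """
    return k / (1.0 + rlap_value)


def rest_frame_age(observer_frame_days, z):
    """Correct an observer-frame light-curve age for (1 + z) time dilation."""
    return observer_frame_days / (1.0 + z)


quality = rlap(r=10.0, lap=0.5)
sigma_z = redshift_error(quality, k=0.06)
age_rest = rest_frame_age(15.0, z=0.5)
```

With these illustrative inputs, a light-curve age of 15 observer-frame days at z = 0.5 corresponds to 10 rest-frame days, which is the quantity to compare with the spectral age.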

We present first results of an impartial and effective
spectroscopic classification of supernovae, based on the
fraction of correlations exceeding a certain rlap cutoff.
We illustrate this through various examples, three of
which are relevant to ongoing SN Ia searches at high
redshift: we are able to distinguish 1991T-like SNe Ia
from other SNe Ia at z ~ 0.5; we identify SNe Ib/c as
such at both z = 0.3 and z = 0.5. The identification
of peculiar SNe Ia, on the other hand, proves easier at
z = 0.5 than z = 0.3, although this result remains to be
verified with more peculiar SN Ia spectral templates. In
all cases we assume the galaxy contamination fraction,
to which SNID is not sensitive, to be negligible.

These examples illustrate both the success and the
limitations of such an automated classification scheme and
highlight the complementarity between spectroscopic
and photometric observations in determining the supernova
type. We are currently testing various combinations
of other classification schemes to improve the classification
of all SN types using SNID.

Supernova discoveries will continue to increase dramat- 
ically with the advent of wide-field imaging telescopes op- 
timized for the detection of transient events, such as Pan- 
STARRS and LSST. These experiments are expected to 
find tens to hundreds of thousands of new supernovae 
each year, few of which will have spectroscopic confirma- 
tion. Thus, most identifications will have to rely solely 
on photometric properties, a difficult task in view of 
the present difficulty of distinguishing between SN types 
(and subtypes) with spectra. It is likely that those
experiments will have to rely on a simpler classification
scheme, focusing on the main SN types (Type Ia, Ib/c,
and II) with little or no distinction between the associated
subtypes.

Future planned space-based high-redshift SN Ia surveys
within the NASA/DOE Joint Dark Energy Mission
(ADEPT, DESTINY, SNAP) will incorporate a spectrograph
and could benefit from a tool such as SNID. A
secure identification of SNe Ia in such experiments will
require sufficient rest frame wavelength coverage beyond
~5500 Å, as the distinction between SNe Ia and Ic (and
between SN Ia subtypes, including peculiar events) is
otherwise problematic (Figs. 20-22).

The current version of SNID will be made available
to the community and we will set up a Web-based
interface for instantaneous supernova typing (including
redshift and age determination). Future versions
of SNID will include a wavelength- and age-weighted
spectrum overlap parameter, lap(λ, t), an explicit
treatment of the covariance between redshift
and age, and a Bayesian approach to type determination,
as currently used for photometric classification
of supernovae (Poznanski, Maoz, & Gal-Yam 2006;
Kuznetsova & Connolly 2007). Moreover, more spectral
templates are continuously being included in the SNID
database through the CfA Supernova Program (more
than 3000 spectra of over 700 supernovae since 1997),
which directly impacts the ability of SNID to securely
identify input spectra. This further enables comparative
studies of SN spectra and quantitative evaluation of
synthetic spectral fits to observations.
TABLE 1
SNID Supernova Database

IAU Name | Subtype  | Ages | Ref.
(1)      | (2)      | (3) | (4)
1981B    | Ia-norm  | 0,17,20,24 | 1
1986G    | Ia-91bg  | -3,-2,0-2,30+(4) | 1
1987A    | II-pec   | 2,4-9,11-27,31-39,40+(69) | 2,3
1989B    | Ia-norm  | -6,-1,4,6,8,10,12(2),13,14,16-25,30+(3) | 1
1990B    | Ic-norm  | 5,6(2),7,9,10,15,28(2),30+(6) | 4,5,CfA
1990I    | Ib-norm  | 11,19,30+(6) | 6
1990N    | Ia-norm  | -13(2),-6,3,5,15,18,30+(5) | 7,8
1990O    | Ia-norm  | -[7-5],0,19,20 | CfA
1991M    | Ia-norm  | 25,26,30+(1) | CfA
1991T    | Ia-91T   | -12,-10,-9,-[7-5],0,19,30+(3) | 9-11
1991bg   | Ia-91bg  | 2,3(3),15,16,19(2),20,26,30+(16) | 12,14
1992A    | Ia-norm  | -5(2),-1,0,2,3,6(2),7,9(2),12(2),16,17,24,28 | 15
1992H^a  | IIP      | 16,29,40+(8) | 16,17
1992ar   | Ic-norm  | 3 | CfA
1993J    | IIb      | 3,4(2),5,11,16,17,18,19(3),22,24(2),25(2),26-28,32-34,38,40+(51) | 18-21
1993ac   | Ia-norm  | 7 | CfA
1994D    | Ia-norm  | -11(2),-10(2),-9,-8,-6,-5(2),-4(2),-3,-2,0,2,3,10-12,13(2),14,15(3),16,17(2),19,24,26,30+(11) | 22,23,CfA
1994I    | Ic-norm  | -6(2),-4,-3,1,2(2),3,21-24,26,30+(5) | 24
1994M    | Ia-norm  | 3-5,8,13,14 | CfA
1994Q    | Ia-norm  | 19 | CfA
1994S    | Ia-norm  | -3(2),1 | CfA
1994T    | Ia-norm  | 1 | CfA
1994ae   | Ia-norm  | 1,2,3(2),4,6,9(2),10,11,30+(7) | CfA
1995D    | Ia-norm  | 4,6,8,10,11,14,16,30+(3) | CfA
1995E    | Ia-norm  | -2,0,2,7,10,30+(1) | CfA
1995ac   | Ia-norm  | 24 | CfA
1995al   | Ia-norm  | 17,25 | CfA
1995bd   | Ia-norm  | 11,21,30+(2) | CfA
1996C    | Ia-norm  | 8 | CfA
1996X    | Ia-norm  | -3,-2,-1(2),0,1(2),2(2),3,5-7,8(2),9,13,21,23,25,30+(1) | 25,CfA
1997br   | Ia-91T   | -9,-8,-7(2),-6(2),-4,8,9,12,17,18,21,24,30+(6) | 26,CfA
1997cn   | Ia-91bg  | 4,29,30+(1) | 27,CfA
1997do   | Ia-norm  | -11,-10,-7,-6,9,11,12,13,15,16,20,21 | 28
1997dt   | Ia-norm  | -[10-7],-4,1,3 | 28
1997ef   | Ic-hyper | -14,-12,-[11-9],-6,-5(2),-4,7,13,14,16,17,19,22,24,27,30+(4) | 29
1998S    | IIn      | 5,6,17,19,20(2),21,28,30-32,34,40+(44) | 30-32
1998V    | Ia-norm  | 1-3,13,14,15,30+(3) | 28
1998ab   | Ia-91T   | -7,7,8,18,19,20,21,22,23,30+(3) | 28
1998aq   | Ia-norm  | -9,-8,0-7,19,21,30+(15) | 28,33
1998bp   | Ia-91bg  | -2,-1,0-2,13,15,25,26,28,30+(1) | 28
1998bu   | Ia-norm  | -[3-1],1,9-14,28,29,30+(21) | 28,34
1998bw   | Ic-hyper | 8,9,12-14,16,18,19,21,24,26-28,30+(8) | 35
1998de   | Ia-91bg  | -[7-5],-3,-2,0,3 | 28
1998dh   | Ia-norm  | -[9-7],-5,-3,0,30+(4) | 28
1998dk   | Ia-norm  | 10,11,13,16,18,21,23,30+(2) | 28
1998dm   | Ia-norm  | 5,6,8,11,13,16,18,25,30+(2) | 28
1998dt   | Ib-norm  | 0,1,4,7,11,12,17 | CfA
1998ec   | Ia-norm  | -2,-1,13,21,27,30+(1) | 28
1998eg   | Ia-norm  | 0,5,18,20,23 | 28
1998es   | Ia-91T   | -[10-1],1-3,16,18,19,20,24,26,30+(7) | 28
1999X    | Ia-norm  | 12,13,15,16,21,29 | 28
1999aa   | Ia-91T   | -[9-1],1,15-18,27-29,30+(11) | 28
1999ac   | Ia-91T   | -4,-3,-1,9-12,25,27,29,30+(9) | 28
1999by   | Ia-91bg  | -[5-2],2-8,11,25,29,30+(3) | 28,36
1999cc   | Ia-norm  | -3,-1,0,2,19,24,25 | 28
1999cl   | Ia-norm  | -8,-7,-[5-1],1,8,30+(1) | 28
1999dq   | Ia-91T   | -[10-2],1-4,6,18,19,27,30+(6) | 28
1999ee   | Ia-norm  | -9,-7,0,2,7,11,16,20,22,27,30+(2) | 28
1999ej   | Ia-norm  | -1,2,4,9,12 | 28
1999em   | IIP      | 6(2),7-9,10(2),11,12,15,16(3),17,19,21,26,37,40+(27) | 37-39
1999ex^b | Ib-norm  | -5,0,9 | 40
1999gd   | Ia-norm  | 3,9,27,30+(2) | 28
1999gh   | Ia-91bg  | 5-9,11,12,28,30+(7) | 28
1999gi   | IIP      | 5,7,8,31,36,39,40+(5) | 41
1999gp   | Ia-91T   | -5,-2,0,3,5,7,22,30+(3) | 28
2000B    | Ia-norm  | 9,14,22,30+(2) | 28
2000E    | Ia-norm  | -6,-3,-1,8,30+(1) | 42
2000H    | IIb      | 28,29(2),31-34,40+(5) | 43,CfA
2000cf   | Ia-norm  | 3,4,15,17,25,26 | 28
2000cn   | Ia-91bg  | -[9-7],9,11,13,22,26,27,30+(1) | 28
2000cx   | Ia-pec   | -[3-1],0-2,6-8,10,12,15,19,22,24,26,28,30+(9) | 44
2000dk   | Ia-91bg  | -5,-4,1,4,10,30+(3) | 28
2000fa   | Ia-norm  | -10,-9,2,3,5,9,11,14,16,18,21,30+(3) | 28
2001V    | Ia-norm  | -[13-9],-[7-5],-3,10,11(2),13,14,18,19,20(2),21(2),22-24,27,28,30+(13) | 28
2002ap   | Ic-hyper | -6,-5,-2,-1,0-2,4-6,7(2),10,12,26,30+(5) | 45,CfA
2002bo   | Ia-norm  | -13(2),-12(2),-[9-6],-5(2),-4(2),-3(3),-2(2),-1(3),0,6,11-22,24,28,29(2),30+(12) | 46,CfA
2002cx   | Ia-pec   | -5,-2,10,14,18,23,24,30+(1) | 47
2002er   | Ia-norm  | -11,-[9-5],-4(2),-[3-1],2-4,6,9,11,12,16(2),20,24,30+(2) | 48
2003du   | Ia-norm  | -12,-10(3),-[9-5],-3,-1(2),0-2,3(2),4,5,8-11,14,17-19,20(2),22(2),24,26,27,29,30+(24) | 49,50
2004ao   | Ib-norm  | 7-13,16,17,20-23,30+(16) | CfA
2004aw   | Ic-norm  | -5,-3,-2(2),-1,0,2(2),3-5,12,18(2),19,21,23,25,26,30+(6) | 51,CfA
2004et   | IIP      | 13-15,17,18,20,22,26,28,40+(15) | CfA
2004gt   | Ic-norm  | 15,17,21,23,30+(8) | CfA
2005bf^c | Ib-pec   | -4,-2,-1,0,2,16,18-23,25-27,29,30+(6) | 52
2005cs   | IIP      | 7-14,16-19,35,36,40+(1) | 53,54
2005hg   | Ib-norm  | -[13-1],0,13,17,27 | CfA
2005hk   | Ia-pec   | -9,-8(2),-7(2),-6(3),-5(2),-4(3),-3(2),-2,-1,4,12,14(2),20,23(2),26,27,30+(6) | 55,CfA
2005mf   | Ic-norm  | -4,-3,3,6 | CfA
2005mz   | Ia-91bg  | -7,11,13,19 | CfA
2006T    | IIb      | 8,10,28,36 | CfA
2006aj   | Ic-broad | -5,-3,-2,-1,0,2,3 | 56



Column headings: (1) IAU designation. (2) Supernova subtype, as defined in Table ^. (3) Rest-frame SN age, rounded to closest whole day,
in days from B-band maximum (for SN Ia), from V-band maximum (for SN Ib/c), or from the estimated date of explosion (for SN II). Adjacent
ages are listed in between square brackets; a "(n)" indicates that n spectra correspond to a same rounded age. Spectra of SN Ia/b/c whose age
exceeds +30 days are grouped together, e.g., 30+(5) indicates there are 5 spectra with ages > +30 days (past maximum); same for spectra of
SN II whose age exceeds 40 days (past explosion). (4) Reference of refereed articles presenting optical spectroscopic data (see "References" below);
"CfA" refers to unpublished spectra obtained by members of the CfA SN Group, some of which are available via the CfA Supernova Archive
(http://www.cfa.harvard.edu/supernova/SNarchive.html).

^a The light curve of SN 1992H exhibited a truncated plateau (Clocchiatti et al. 1996),
but its spectra are otherwise indistinguishable from other Type IIP supernovae.

^b Hamuy et al. (2002) classify SN 1999ex as an intermediate Ib/c
event, while Parrent et al. (2007) support the Ib classification, highlighting the similarity with the peculiar SN Ib 2005bf (Tominaga et al. 2005).
We classify SN 1999ex as a normal Type Ib supernova and note that the essential spectroscopic peculiarity of SN 2005bf (namely the increasing
absorption velocity of the He I λ5876 line; Tominaga et al. 2005) is not present in SN 1999ex.

^c The V-band light curve of SN 2005bf had two
distinct maxima. The age is expressed in days from the first maximum.

References: (1) Wells et al. 1994; (2) Phillips et al. 1988; (3) Phillips et al. 1990; (4) Matheson et al. 2001; (5) Clocchiatti et al. 2001;
(6) Elmhamdi et al. 2004; (7) Leibundgut et al. 1991; (8) Mazzali et al. 1993; (9) Jeffery et al. 1992; (10) Schmidt et al. 1994; (11) Mazzali et al.
1995; (12) Leibundgut et al. 1993; (13) Turatto et al. 1996; (14) Gomez et al. 1996; (15) Kirshner et al. 1993; (16) Clocchiatti et al. 1996; (17)
Gomez & Lopez 2000; (18) Jeffery et al. 1994; (19) Barbon et al. 1995; (20) Matheson et al. 2000; (21) Fransson et al. 2005; (22) Hoflich 1991;
(23) Patat et al. 1996; (24) Millard et al. 1999; (25) Salvo et al. 2001; (26) Li et al. 1999; (27) Turatto et al. 1998; (28) Matheson et al. 2007; (29)
Iwamoto et al. 2000; (30) Lentz et al. 2001; (31) Fassia et al. 2001; (32) Fransson et al. 2005; (33) Branch et al. 2003; (34) Jha et al. 1999; (35)
Patat et al. 2001; (36) Garnavich et al. 2004; (37) Baron et al. 2000; (38) Hamuy et al. 2001; (39) Leonard et al. 2002a; (40) Hamuy et al. 2002;
(41) Leonard et al. 2002b; (42) Valentini et al. 2003; (43) Branch et al. 2002; (44) Li et al. 2001a; (45) Gal-Yam et al. 2002; (46) Benetti et al.
2004; (47) Li et al. 2003; (48) Kotak et al. 2005; (49) Anupama et al. 2005; (50) Stanishev et al. 2007; (51) Taubenberger et al. 2006; (52)
Tominaga et al. 2005; (53) Brown et al. 2007; (54) Dessart et al. 2007; (55) Phillips et al. 2007; (56) Modjaz et al. 2006.
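
The compact age notation of column (3) can be expanded programmatically. The Python sketch below is a hypothetical helper, not part of SNID: it assumes that a bracketed entry such as -[7-5] stands for the ages -7, -6, -5, that an unbracketed entry such as 0-2 stands for 0, 1, 2 (as inferred from the table entries), and that 30+(n) or 40+(n) counts n late-time spectra.

```python
import re
from collections import Counter

_TOKEN = re.compile(
    r"^(?:"
    r"(?P<late>\d+)\+\((?P<nlate>\d+)\)"      # 30+(5): five late-time spectra
    r"|-\[(?P<nhi>\d+)-(?P<nlo>\d+)\]"        # -[7-5]: ages -7, -6, -5
    r"|(?P<lo>\d+)-(?P<hi>\d+)"               # 4-9: ages 4 through 9
    r"|(?P<age>-?\d+)(?:\((?P<n>\d+)\))?"     # -3 or 12(2)
    r")$"
)


def parse_ages(cell):
    """Return (Counter mapping age to number of spectra, late-time count)."""
    counts = Counter()
    late = 0
    for token in cell.replace(" ", "").split(","):
        m = _TOKEN.match(token)
        if m is None:
            raise ValueError(f"unrecognized age token: {token!r}")
        if m.group("late") is not None:
            late += int(m.group("nlate"))
        elif m.group("nhi") is not None:
            for age in range(-int(m.group("nhi")), -int(m.group("nlo")) + 1):
                counts[age] += 1
        elif m.group("lo") is not None:
            for age in range(int(m.group("lo")), int(m.group("hi")) + 1):
                counts[age] += 1
        else:
            counts[int(m.group("age"))] += int(m.group("n") or 1)
    return counts, late


# Entries taken from the rows for SN 1990O, SN 1986G, and SN 1990B.
ages_1990O, late_1990O = parse_ages("-[7-5],0,19,20")
ages_1986G, late_1986G = parse_ages("-3,-2,0-2,30+(4)")
ages_1990B, late_1990B = parse_ages("5,6(2),7,9,10,15,28(2),30+(6)")
```

Summing the counter values and the late-time count gives the total number of template spectra for a given supernova.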



